The Install and Upgrade Framework (IUF) provides commands which install, upgrade, and deploy products on systems managed by CSM. IUF capabilities are described in detail in the IUF section of the Cray System Management Documentation. The initial install and upgrade workflows described in the HPE Cray EX System Software Stack Installation and Upgrade Guide for CSM (S-8052) detail when and how to use IUF with a new release of SAT or any other HPE Cray EX product.
This document does not replicate install, upgrade, or deployment procedures detailed in the Cray System Management Documentation. This document provides details regarding software and configuration content specific to SAT which is needed when installing, upgrading, or deploying a SAT release. The Cray System Management Documentation will indicate when sections of this document should be referred to for detailed information.
IUF performs tasks in the following stages for a release of SAT:
deliver-product stage
update-vcs-config stage
update-cfs-config stage
prepare-images stage
management-nodes-rollout stage
IUF uses a variety of CSM and SAT tools when performing these tasks. The IUF section of the Cray System Management Documentation describes how to use these tools directly if it is desirable to use them instead of IUF.
This section describes SAT details that an administrator must be aware of before running IUF stages. Entries are prefixed with Information if no administrative action is required or Action if an administrator needs to perform tasks outside of IUF.
Information (update-vcs-config stage): This stage is only run if a VCS working branch is specified for SAT. By default, SAT does not create or specify a VCS working branch.
Information (update-cfs-config stage): This stage only applies to the management configuration and not to the managed configuration.
Information (prepare-images stage): This stage only applies to management images and not to managed images.
After installing SAT with IUF, complete the SAT configuration procedures described below before using SAT.
Ellipses (...) in shell output indicate omitted lines.
In the examples, replace x.y.z with the version of the SAT product stream being installed.
To run SAT commands on the manager NCNs, first set up authentication to the API gateway. For more information on authentication types and authentication credentials, see SAT Command Authentication.
The admin account used to authenticate with sat auth must be enabled in Keycloak and must have its assigned role set to admin. For more information on Keycloak accounts and changing Role Mappings, refer to both Configure Keycloak Account and Create Internal User Accounts in the Keycloak Shasta Realm in the Cray System Management Documentation.
Prerequisite: The sat CLI has been installed following the IUF section of the Cray System Management Documentation.
The following procedure globally configures the username used by SAT and authenticates to the API gateway.
(ncn-m001#) Generate a default SAT configuration file if one does not exist.
sat init
Example output:
Configuration file "/root/.config/sat/sat.toml" generated.
Note: If the configuration file already exists, sat init prints the following error and leaves the existing file in place.
ERROR: Configuration file "/root/.config/sat/sat.toml" already exists.
Not generating configuration file.
Edit ~/.config/sat/sat.toml and set the username option in the api_gateway section of the configuration file.
username = "crayadmin"
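To confirm the edit before authenticating, the configured username can be read back from the file. The following is a minimal sketch and not part of SAT: the function name sat_config_username is hypothetical, and it assumes simple one-line key = "value" entries like the one shown above rather than performing full TOML parsing.

```shell
# Hypothetical helper (not part of SAT): print the username set in the
# [api_gateway] section of a SAT configuration file.
# Assumes one-line `key = "value"` entries; this is not a full TOML parser.
sat_config_username() {
    local config_file="${1:-$HOME/.config/sat/sat.toml}"
    awk '
        # Track whether the current line is inside the [api_gateway] section.
        /^[ \t]*\[/ { in_section = ($0 ~ /^[ \t]*\[api_gateway\]/) }
        # Commented-out lines start with "#" and therefore never match here.
        in_section && /^[ \t]*username[ \t]*=/ {
            sub(/^[^=]*=[ \t]*/, "")   # drop everything up to the "="
            gsub(/"/, "")              # strip the surrounding quotes
            sub(/[ \t]+$/, "")         # strip trailing whitespace
            print
            exit
        }
    ' "$config_file"
}
```

Called with no arguments, the function reads ~/.config/sat/sat.toml. It prints nothing if the username line is absent or still commented out.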
(ncn-m001#) Run sat auth. Enter the password when prompted.
sat auth
Example output:
Password for crayadmin:
Succeeded!
(ncn-m001#) Other sat commands are now authenticated to make requests to the API gateway.
sat status
Generate S3 credentials and write them to local files so the SAT user can access S3 storage. To use the SAT S3 bucket, the system administrator must generate the S3 access key and secret key and write them to local files. This must be done on every Kubernetes control plane node where SAT commands are run.
SAT uses S3 storage for several purposes, most importantly to store the site-specific information set with sat setrev (see Set System Revision Information).
(ncn-m001#) Ensure the files are readable only by root.
touch /root/.config/sat/s3_access_key \
/root/.config/sat/s3_secret_key
chmod 600 /root/.config/sat/s3_access_key \
/root/.config/sat/s3_secret_key
(ncn-m001#) Write the credentials to local files using kubectl.
kubectl get secret sat-s3-credentials \
    -o jsonpath='{.data.access_key}' | base64 -d > \
    /root/.config/sat/s3_access_key
kubectl get secret sat-s3-credentials \
    -o jsonpath='{.data.secret_key}' | base64 -d > \
    /root/.config/sat/s3_secret_key
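After writing the credentials, it can be useful to confirm that both files are non-empty and still restricted to root. The following is a minimal sketch and not part of SAT: the function name check_s3_key_files is hypothetical, and it relies on GNU stat -c, which is assumed to be available on the node.

```shell
# Hypothetical check (not part of SAT): succeed only if every given file
# is non-empty and has mode 600. Relies on GNU `stat -c` (assumed available).
check_s3_key_files() {
    local f mode status=0
    for f in "$@"; do
        # -s is true only for files that exist and have a size greater than zero.
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            status=1
            continue
        fi
        mode=$(stat -c '%a' "$f")
        if [ "$mode" != "600" ]; then
            echo "unexpected mode $mode: $f" >&2
            status=1
        fi
    done
    return "$status"
}
```

For the files in this procedure, the call would be check_s3_key_files /root/.config/sat/s3_access_key /root/.config/sat/s3_secret_key. An empty file usually means the kubectl command failed before base64 produced any output.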
Verify the S3 endpoint specified in the SAT configuration file is correct.
(ncn-m001#) Get the SAT configuration file’s endpoint value.
Note: If the endpoint line in the command’s output is commented out, indicated by an initial # character, SAT uses the default value, "https://rgw-vip.nmn".
grep endpoint ~/.config/sat/sat.toml
Example output:
# endpoint = "https://rgw-vip.nmn"
(ncn-m001#) Get the sat-s3-credentials secret’s endpoint value.
kubectl get secret sat-s3-credentials \
    -o jsonpath='{.data.s3_endpoint}' | base64 -d | xargs
Example output:
https://rgw-vip.nmn
Compare the two endpoint values.
If the values differ, change the SAT configuration file’s endpoint value to match the secret’s.
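The comparison can also be scripted. The following is a minimal sketch and not part of SAT: the function names sat_effective_endpoint and endpoints_match are hypothetical, and the sketch assumes the configuration file contains at most one uncommented endpoint line, written as endpoint = "value".

```shell
# Hypothetical helper (not part of SAT): print the S3 endpoint that applies
# according to a SAT configuration file, falling back to the documented
# default when the endpoint line is absent or commented out.
# Assumes at most one uncommented `endpoint = "..."` line in the file.
sat_effective_endpoint() {
    local config_file="${1:-$HOME/.config/sat/sat.toml}"
    local value
    value=$(sed -n 's/^[[:space:]]*endpoint[[:space:]]*=[[:space:]]*"\(.*\)"[[:space:]]*$/\1/p' \
        "$config_file" | head -n 1)
    echo "${value:-https://rgw-vip.nmn}"
}

# Succeed if the effective endpoint in the file ($1) equals the value ($2),
# for example the value printed from the sat-s3-credentials secret.
endpoints_match() {
    [ "$(sat_effective_endpoint "$1")" = "$2" ]
}
```

Pass the value printed by the kubectl command as the second argument to endpoints_match. A non-zero exit status indicates that the configuration file's endpoint needs to be changed to match the secret's.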
(ncn-m001#) Copy SAT configurations to each manager node on the system.
for i in ncn-m002 ncn-m003; do echo $i; ssh ${i} \
mkdir -p /root/.config/sat; \
scp -pr /root/.config/sat ${i}:/root/.config; done
Note: Depending on how many manager nodes are on the system, the list of manager nodes may be different. This example assumes three manager nodes, where the configuration files must be copied from ncn-m001 to ncn-m002 and ncn-m003. Therefore, the list of hosts above is ncn-m002 and ncn-m003.
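For systems with a different number of manager nodes, the host list can be generated instead of typed by hand. The following is a minimal sketch and not part of SAT: both function names are hypothetical, and the sketch assumes manager nodes are numbered consecutively using the ncn-mNNN naming shown in this procedure. It prints the copy commands for review rather than running them.

```shell
# Hypothetical helper (not part of SAT): print the manager nodes other than
# ncn-m001, given the total number of manager nodes on the system.
# Assumes consecutive numbering with the ncn-mNNN naming used above.
other_manager_nodes() {
    local total="$1" n=2
    while [ "$n" -le "$total" ]; do
        printf 'ncn-m%03d\n' "$n"
        n=$((n + 1))
    done
}

# Print (rather than run) the commands that copy the SAT configuration
# directory to each of those nodes, so they can be reviewed first.
print_sat_config_copy_commands() {
    local host
    for host in $(other_manager_nodes "$1"); do
        echo "ssh ${host} mkdir -p /root/.config/sat"
        echo "scp -pr /root/.config/sat ${host}:/root/.config"
    done
}
```

For example, print_sat_config_copy_commands 5 lists the commands for ncn-m002 through ncn-m005. Printing first keeps the sketch free of side effects; the reviewed lines can then be run from ncn-m001.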
If installing SAT on a multi-tenant system, the tenant name can be configured at this point. For more information, see Configure multi-tenancy.
HPE service representatives use system revision information data to identify systems in support cases.
(ncn-m001#) Set System Revision Information.
Run sat setrev and follow the prompts to set the site-specific values.
Tip: For “System type”, a system with any liquid-cooled components should be considered a liquid-cooled system. In that case, set “System type” to EX-1C.
sat setrev
Example output:
--------------------------------------------------------------------------------
Setting:        Serial number
Purpose:        System identification. This will affect how snapshots are
                identified in the HPE backend services.
Description:    This is the top-level serial number which uniquely identifies
                the system. It can be requested from an HPE representative.
Valid values:   Alpha-numeric string, 4 - 20 characters.
Type:           <class 'str'>
Default:        None
Current value:  None
--------------------------------------------------------------------------------
Please do one of the following to set the value of the above setting:
    - Input a new value
    - Press CTRL-C to exit
...
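The "Valid values" line in the prompt above can be checked before the value is entered. The following is a minimal sketch and not part of SAT: the function name is_valid_serial_number is hypothetical, and it encodes only the rule quoted in the prompt (alphanumeric, 4 to 20 characters). The sat setrev prompt remains the authority on what is accepted.

```shell
# Hypothetical pre-check (not part of SAT): succeed if the argument is an
# alphanumeric string of 4 to 20 characters, per the "Valid values" text
# shown in the sat setrev prompt.
is_valid_serial_number() {
    # LC_ALL=C keeps the character ranges limited to ASCII letters and digits.
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[A-Za-z0-9]{4,20}$'
}
```

For example, is_valid_serial_number 12345 succeeds, while a three-character value or a value containing a hyphen fails.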
Verify System Revision Information.
(ncn-m001#) Run sat showrev and verify the output shown in the “System Revision Information” table.
sat showrev
Example table output:
################################################################################
System Revision Information
################################################################################
+---------------------+---------------+
| component           | data          |
+---------------------+---------------+
| Company name        | HPE           |
| Country code        | US            |
| Interconnect        | Sling         |
| Product number      | R4K98A        |
| Serial number       | 12345         |
| Site name           | HPE           |
| Slurm version       | slurm 20.02.5 |
| System description  | Test System   |
| System install date | 2021-01-29    |
| System name         | eniac         |
| System type         | EX-1C         |
+---------------------+---------------+
################################################################################
Product Revision Information
################################################################################
+--------------+-----------------+------------------------------+------------------------------+
| product_name | product_version | images                       | image_recipes                |
+--------------+-----------------+------------------------------+------------------------------+
| csm          | 0.8.14          | cray-shasta-csm-sles15sp1... | cray-shasta-csm-sles15sp1... |
| sat          | 2.0.1           | -                            | -                            |
| sdu          | 1.0.8           | -                            | -                            |
| slingshot    | 0.8.0           | -                            | -                            |
| sma          | 1.4.12          | -                            | -                            |
+--------------+-----------------+------------------------------+------------------------------+
################################################################################
Local Host Operating System
################################################################################
+-----------+----------------------+
| component | version              |
+-----------+----------------------+
| Kernel    | 5.3.18-24.15-default |
| SLES      | SLES 15-SP2          |
+-----------+----------------------+