The Install and Upgrade Framework (IUF) provides commands which install, upgrade, and deploy products on systems managed by CSM. IUF capabilities are described in detail in the IUF section of the Cray System Management Documentation. The initial install and upgrade workflows described in the HPE Cray EX System Software Stack Installation and Upgrade Guide for CSM (S-8052) detail when and how to use IUF with a new release of UAN or any other HPE Cray EX product.
This document does not replicate the install, upgrade, or deployment procedures detailed in the IUF section of the Cray System Management Documentation. This document provides details regarding software and configuration content specific to UAN which may be needed when installing, upgrading, or deploying a UAN release. The IUF section of the Cray System Management Documentation will indicate when sections of this document must be seen for detailed information.
IUF will perform the following tasks for a release of the HPE Cray Supercomputing UAN product software.
process-media
stage:
pre-install-check
stage:
deliver-product
stage:
update-vcs-config
stage:
update-cfs-config
stage:
prepare-images
stage:
managed-nodes-rollout
stage:
IUF uses a variety of CSM and SAT tools when performing these tasks. The IUF section of the Cray System Management Documentation describes how to use these tools directly if it is desirable to use them instead of IUF.
IUF uses the following files to drive the install or upgrade of UAN. These files are provided by the hpc-csm-software-recipe
VCS repository.
product_vars.yaml
(Required)
working_branch
variables for the products. This file is in a directory under /etc/cray/upgrade/csm/
. For example, /etc/cray/upgrade/csm/admin
.iuf
command line using the IUF recipe vars (-rv
) option. For example, -rv /etc/cray/upgrade/csm/admin
.site_vars.yaml
(Required)
product_vars.yaml
and defines the site VCS branching strategy. It may be placed in any directory under /etc/cray/upgrade/csm
.iuf
command line using the IUF site vars (-sv
) option. For example, -sv /etc/cray/upgrade/csm/admin/site_vars.yaml
.compute-and-uan-bootprep.yaml
(Required)
/etc/cray/upgrade/csm/bootprep
directory and defines variables for the following tasks:
iuf
command line using the IUF bootprep-config-managed (-bc
) options. For example, -bc /etc/cray/upgrade/csm/bootprep/compute-and-uan-bootprep.yaml
.This section describes any UAN details that an administrator must be aware of before executing IUF stages. Entries are prefixed with Information if no administrative action is required or Action if an administrator must perform tasks outside of IUF.
Action: Before executing this stage, the administrator must ensure that the UAN product tar file is in a media directory under /etc/cray/upgrade/csm/
. When more than one product is being installed, place all the product tar files in the same directory.
Action: Before executing this stage, the administrator must perform the prerequisites found in Configuring a UAN for K3s when UAIs on UANs is enabled.
Action: Before executing this stage, the administrator must ensure the IUF site variables file (see iuf -sv SITE_VARS
) is updated to reflect site preferences, including the wanted VCS branching configuration. The update-vcs-config
stage will use the branching configuration when modifying UAN branches in VCS.
Action: Before executing this stage, any site-local UAN configuration changes must be made so that the following stages execute using the wanted UAN configuration values. See the Basic UAN Configuration section of this documentation for UAN configuration content details. The Prepare for UAN Product Installation section is required for fresh installation scenarios.
The procedures in this guide assume that the HPE Cray Supercomputing EX system has dedicated UANs. If the HPE Cray Supercomputing EX system does not have dedicated UANs, skip the steps for installing and configuring them.
The following subsections describe most of the UAN content installed and configured on the system by IUF. The new version of UAN (2.6.XX) and its artifacts will be displayed in the CSM product catalog alongside any previously released version of UAN and its artifacts.
UAN provides configuration content in the form of Ansible roles and plays. This content is uploaded to a VCS repository in a branch with a specific UAN version number. That release number, such as (2.6.XX) distinguishes it from any previously released UAN configuration content. This content is described in detail in the Basic UAN Configuration section.
For application nodes based on COS, the COS compute image is used as the base application node image and two COS CFS layers are required. The first COS CFS layer runs the cos-application.yml
Ansible playbook and ensures that the COS content is applied as part of the image customization and node personalization processes. This COS CFS layer must precede the UAN CFS layer in the UAN CFS configuration. A second COS CFS layer running the Ansible playbook, cos-application-after.yml
, runs after the UAN CFS layer of the UAN CFS configuration. This second COS CFS layer ensures that the application node initrd is rebuilt and that any customer-defined filesystems are configured.
Input files for sat bootprep
are provided in the hpc-csm-software-recipe
VCS repository and include COS components in the CFS configuration, node image, and BOS session template definitions. The compute-and-uan-bootprep.yaml
input file is used for compute and application nodes.
The following instructions describe how to set the root password for UAN/Application nodes.
Obtain the HashiCorp Vault root token.
ncn-m001# kubectl get secrets -n vault cray-vault-unseal-keys \
-o jsonpath='{.data.vault-root}' | base64 -d; echo
Log into the HashiCorp Vault pod.
ncn-m001# kubectl exec -itn vault cray-vault-0 -c vault -- sh
After you are attached to the pod’s shell, log into vault and read the secret/uan
key by executing the following commands. If the secret is empty, “No value found at secret/uan” will be displayed.
pod# export VAULT_ADDR=http://cray-vault:8200
pod# vault login
pod# vault read secret/uan
If no value is found for this key, complete the following steps in another shell on the NCN management node.
a. Generate the password HASH for the root user. Replace ‘PASSWORD’ with a chosen root password.
ncn-m001# openssl passwd -6 -salt $(< /dev/urandom tr -dc A-Z-a-z-0-9 | head -c4) PASSWORD
b. Take the HASH value returned from the previous command and enter the following in the vault pod’s shell. Instead of HASH, use the value returned from the previous step. You must escape the HASH value with the single quote to preserve any special characters that are part of the HASH value. If you previously have exited the pod, repeat the previous Steps 1-3; there is no need to perform the vault read since the content is empty.
pod# vault write secret/uan root_password='HASH'
c. Verify that the new hash value is stored.
pod# vault read secret/uan
...
pod# exit
Optional Write any uan_ldap sensitive data, such as the ldap_default_authtok
value, to the HashiCorp Vault.
The vault login command will request a token. That token value is the output of the Step 1. The vault read secret/uan_ldap
command verifies that the uan_ldap
data was stored correctly. Any values stored here will be written to the UAN /etc/sssd/sssd.conf
file in the [domain]
section by CFS.
This example shows storing a value for ldap_default_authtok
. If more than one variable must be stored, they must be written in space separated key=value
pairs on the same vault write secret/uan_ldap
command line.
ncn-m001# kubectl exec -it -n vault cray-vault-0 -- sh
export VAULT_ADDR=http://cray-vault:8200
vault login
vault write secret/uan_ldap ldap_default_authtok='TOKEN'
vault read secret/uan_ldap
The UAN product provides an Application Node image. This image is based on SUSE Linux Enterprise High Performance Computing 15 (SLE HPC 15) and uses a “stock” (that is, not customized for HPE Cray systems) kernel. Customers can use this image on Application nodes that do not require any COS compatibility as UAN does. Unlike the default UAN image, this Application Node image is not based on the COS image. However, like the UAN default image, it is uploaded to IMS as part of the installation process.
Customers must finish the installation or upgrade of the UAN product before booting an Application Node with SLE HPC 15 provided by that UAN product release. See Booting an Application Node with a SLES Image (Technical Preview) for more information about this image and instructions on deploying it to Application Nodes.
UAN provides RPMs used on UAN nodes. The RPMs are uploaded to Nexus as part of the installation process.
The following Nexus raw repositories are created:
The following Nexus group repositories are created and reference the preceding Nexus raw repos.
The uan-2.6-sle-15sp4 and uan-2.6-sle-15sp3 Nexus group repositories are used when building UAN node images and are accessible on UAN nodes after boot.