Beginning with UAN 2.6, the procedures described here are automatically performed by IUF during installation and upgrade of the HPE Cray Supercomputing UAN product. See Install or Upgrade UAN for details. The procedures shown here are for cases when a new image is needed after the UAN product is installed or upgraded and the product release is prior to 2.6. For release 2.6 and newer, perform Build a New UAN Image Using a COS Recipe for these cases.
This procedure updates the configuration management git repository to match the installed version of the HPE Cray Supercomputing UAN product. That updated configuration is then used to create UAN boot images and a BOS session template.
UAN-specific configuration and other required configurations related to UANs are covered in this topic. See product-specific documentation for further information on configuring other HPE products (for example, workload managers and the HPE Cray Programming Environment) that may be configured on the UANs.
The workflow for manually creating images to boot UANs is:
- Clone the UAN configuration management repository, customize the configuration on a new branch, and push the changes.
- Create an input file that sat bootprep will use to automatically create CFS configurations, IMS images, and BOS session templates for UANs.
- Run sat bootprep to generate all the artifacts a BOS session requires to boot UANs.
Replace PRODUCT_VERSION and CRAY_EX_HOSTNAME in the example commands in this procedure with the currently installed UAN product version (see Step 1) and the hostname of the HPE Cray Supercomputing EX system, respectively.
Obtain the artifact IDs and other information from the cray-product-catalog Kubernetes ConfigMap. Record the following information:
- the clone_url
- the import_branch value
Upon successful installation of the UAN product, the UAN configuration is cataloged in this ConfigMap. This information is required for this procedure.
PRODUCT_VERSION will be replaced by a numbered version string, such as 2.1.7 or 2.3.0.
ncn-m001# kubectl get cm -n services cray-product-catalog -o json | jq -r .data.uan
PRODUCT_VERSION:
  configuration:
    clone_url: https://vcs.CRAY_EX_HOSTNAME/vcs/cray/uan-config-management.git
    commit: 6658ea9e75f5f0f73f78941202664e9631a63726
    import_branch: cray/uan/PRODUCT_VERSION
    import_date: 2021-02-02 19:14:18.399670
    ssh_url: git@vcs.CRAY_EX_HOSTNAME:cray/uan-config-management.git
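The two values to record can also be pulled out of the output with standard text tools. The following is a minimal sketch, assuming the YAML printed by the command above has been saved to a file. The helper name catalog_value and the temporary sample file are hypothetical; the sample content reuses the placeholder values from the example output.

```shell
#!/usr/bin/env bash
# catalog_value is a hypothetical helper, not part of any HPE tool.
# It prints the value of the first line whose key matches $1 in file $2.
# If the catalog lists several product versions, only the first match is shown.
catalog_value() {
    local key="$1" file="$2"
    awk -v key="$key" '$1 == (key ":") {
        # Drop leading indentation, the key itself, and the following spaces.
        sub(/^[ \t]*[^ \t]+[ \t]+/, "")
        print
        exit
    }' "$file"
}

# Sample input mirroring the example output (placeholders left as-is).
sample=$(mktemp)
cat > "$sample" <<'EOF'
PRODUCT_VERSION:
  configuration:
    clone_url: https://vcs.CRAY_EX_HOSTNAME/vcs/cray/uan-config-management.git
    commit: 6658ea9e75f5f0f73f78941202664e9631a63726
    import_branch: cray/uan/PRODUCT_VERSION
EOF

catalog_value clone_url "$sample"
catalog_value import_branch "$sample"
rm -f "$sample"
```

On a real system the saved file would contain an actual version string and hostname in place of the placeholders.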
Obtain the username and password for the crayvcs user from the Kubernetes secret for use in the next command.
ncn-m001# VCS_USER=$(kubectl get secret -n services vcs-user-credentials --template='{{.data.vcs_username}}' | base64 --decode)
ncn-m001# VCS_PASS=$(kubectl get secret -n services vcs-user-credentials --template='{{.data.vcs_password}}' | base64 --decode)
Clone the UAN configuration management repository.
The repository is in the VCS/Gitea service, and its location is reported in the configuration.clone_url key of the cray-product-catalog Kubernetes ConfigMap. In the command that clones the repository, the CRAY_EX_HOSTNAME host from the clone_url is replaced with api-gw-service-nmn.local.
ncn-m001# git clone https://$VCS_USER:$VCS_PASS@api-gw-service-nmn.local/vcs/cray/uan-config-management.git
. . .
ncn-m001# cd uan-config-management && git checkout cray/uan/PRODUCT_VERSION && git pull
Branch 'cray/uan/PRODUCT_VERSION' set up to track remote branch 'cray/uan/PRODUCT_VERSION' from 'origin'.
Already up to date.
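The host substitution used in the clone command can be sketched as a small shell function. This is an illustration only: rewrite_clone_url is a hypothetical helper, and it assumes the clone_url begins with https:// and that the whole host portion is replaced, as in the clone command shown above.

```shell
# rewrite_clone_url is a hypothetical helper, not part of any HPE tool.
# It swaps the host part of an https clone_url for the internal API gateway
# name used in this procedure and leaves the path unchanged.
rewrite_clone_url() {
    printf '%s\n' "$1" | sed -E 's#^(https://)[^/]+#\1api-gw-service-nmn.local#'
}

rewrite_clone_url "https://vcs.CRAY_EX_HOSTNAME/vcs/cray/uan-config-management.git"
# prints https://api-gw-service-nmn.local/vcs/cray/uan-config-management.git
```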
Create a branch using the imported branch from the installation to customize the UAN image.
The imported branch name is reported in the configuration.import_branch key under the UAN section of the cray-product-catalog Kubernetes ConfigMap. The format is cray/uan/PRODUCT_VERSION. In this guide, an integration-PRODUCT_VERSION branch is used for examples to comply with IUF defaults, but the name can be any valid git branch name configured to be used by IUF.
Modifying the cray/uan/PRODUCT_VERSION branch that the UAN product installation created is not allowed by default.
ncn-m001# git checkout -b integration-PRODUCT_VERSION && git merge cray/uan/PRODUCT_VERSION
Switched to a new branch 'integration-PRODUCT_VERSION'
Already up to date.
Apply any site-specific customizations and modifications to the Ansible configuration for the UAN nodes and commit the changes.
The default Ansible play to configure UAN nodes is site.yml in the base of the uan-config-management repository. The roles that are executed in this play allow for custom configuration as required for the system.
Consult the individual Ansible role README.md files in the uan-config-management repository roles directory to configure individual role variables. Roles prefixed with uan_ are specific to UAN configuration and include network interfaces, disk, LDAP, software packages, and message of the day roles.
NOTE: Administrators must ensure the uan_can_setup variable is set to the correct value for the site. This variable controls how the nodes are configured for user access. When uan_can_setup is yes, user access is over the CAN or CHN, based on the BICAN System Default Route setting in SLS. When uan_can_setup is no, the administrator must configure the user access interface and default route. See Configure Interfaces on UANs.
Warning: Never place sensitive information such as passwords in the git repository.
The following example shows how to add a vars.yml file containing site-specific configuration values to the Application_UAN group variable location.
These and other Ansible files do not necessarily need to be modified for UAN image creation. See Basic UAN Configuration for instructions for site-specific UAN configuration, including CAN/CHN configuration.
ncn-m001# vim group_vars/Application_UAN/vars.yml
ncn-m001# git add group_vars/Application_UAN/vars.yml
ncn-m001# git commit -m "Add vars.yml customizations"
[integration-PRODUCT_VERSION ecece54] Add vars.yml customizations
1 file changed, 1 insertion(+)
create mode 100644 group_vars/Application_UAN/vars.yml
Push the changes to the repository using the proper credentials, including the password obtained previously.
ncn-m001# git push --set-upstream origin integration-PRODUCT_VERSION
Username for 'https://api-gw-service-nmn.local': crayvcs
Password for 'https://crayvcs@api-gw-service-nmn.local':
. . .
remote: Processed 1 references in total
To https://api-gw-service-nmn.local/vcs/cray/uan-config-management.git
* [new branch] integration-PRODUCT_VERSION -> integration-PRODUCT_VERSION
Branch 'integration-PRODUCT_VERSION' set up to track remote branch 'integration-PRODUCT_VERSION' from 'origin'.
The configuration parameters have been stored in a branch in the UAN git repository. The next phase of the process uses sat bootprep to handle creating the CFS configurations, IMS images, and BOS session templates for UANs.
With Shasta Admin Toolkit (SAT) version 2.2.16 and later, HPE recommends that administrators create an input file for use with sat bootprep.
A sat bootprep input file has three sections:
- configurations: specifies each layer to be included in the CFS configuration for the UAN images for image customization and node personalization.
- images: specifies the IMS images to create for UAN nodes.
- session_templates: creates BOS session templates. This section references the named IMS image that sat bootprep generates, as well as a CFS configuration.
These sections create CFS configurations, IMS images, and BOS session templates, respectively. Each section may have multiple elements to create more than one CFS, IMS, or BOS artifact. The format is similar to the input files for CFS, IMS, and BOS, but SAT automates the process with fewer steps. Follow the subsections below to create a UAN bootprep input file.
See also HPE Cray EX System Software Stack Installation and Upgrade Guide for CSM (S-8052) for further information on configuring other HPE products, as this procedure documents only the required configuration of the UAN.
Verify that the installed version of SAT is 2.2.16 or later.
ncn-m001# sat showrev --products --filter 'product_name="sat"'
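Comparing dotted version strings by eye is error-prone (2.2.9 is older than 2.2.16, although a plain string comparison says otherwise). The following is a minimal sketch of the check, assuming GNU sort with its -V (version sort) option is available. The helper name version_at_least is hypothetical, and the installed value is a stand-in for whatever sat showrev reports.

```shell
# version_at_least is a hypothetical helper, not part of SAT.
# It succeeds when version $1 is greater than or equal to version $2.
version_at_least() {
    # The smaller of the two versions sorts first; if that is the required
    # version ($2), then the installed version ($1) is new enough.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

installed="2.2.16"   # stand-in: replace with the version reported by sat showrev
if version_at_least "$installed" "2.2.16"; then
    echo "SAT $installed meets the 2.2.16 minimum"
else
    echo "SAT $installed is older than 2.2.16"
fi
```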
Obtain the version of each product that will be included in a CFS configuration layer.
ncn-m001# sat showrev --products --filter 'product_name="PRODUCT_NAME"'
The example sat bootprep input file in this procedure includes the following products:
- slingshot-host-software
- cos
- csm
- uan
Your site may require additional products in the input file.
Record or save the list of COS image recipes returned in the previous step.
You will select one of these recipes as the base for the UAN image in a later step.
Create a sat bootprep input file.
ncn-m001# touch uan-bootprep.yaml
Open the empty sat bootprep input file in an editor.
Add the CFS configuration content.
The Slingshot Host Software (SHS) CFS layer must be listed first. This layer is required because the UAN layer will attempt to install DVS and Lustre packages that require SHS to be installed first. The correct playbook for Cassini or Mellanox must also be specified. Consult the Slingshot Host Software documentation for more information.
Beginning with UAN version 2.6.0, CFS configuration roles which are provided by COS are defined as two separate COS configuration layers as shown in the following example. Prior to UAN version 2.6.0, these roles were included in the UAN configuration layer. Separating these roles into COS layers allows COS updates to be independent from UAN updates.
The following example creates a CFS configuration named uan-config:
Example:
configurations:
- name: uan-config
  layers:
  - name: shs-mellanox_install-integration
    playbook: shs_mellanox_install.yml
    product:
      name: slingshot-host-software
      version: 2.0.0
      branch: integration
# - name: shs-cassini_install-integration
#   playbook: shs_cassini_install.yml
#   product:
#     name: slingshot-host-software
#     version: 2.0.0
#     branch: integration
  - name: cos-application-integration
    playbook: cos-application.yml
    product:
      name: cos
      version: 2.5
  - name: csm-packages-integration
    playbook: csm_packages.yml
    product:
      name: csm
      version: 1.4
  - name: uan-set-nologin
    playbook: set_nologin.yml
    product:
      name: uan
      version: 2.6.0
      branch: integration-PRODUCT_VERSION
  - name: uan
    playbook: site.yml
    product:
      name: uan
      version: 2.6.0
      branch: integration-PRODUCT_VERSION
  # ... add configuration layers for other products here, if desired ...
  - name: uan-rebuild-initrd
    playbook: rebuild-initrd.yml
    product:
      name: uan
      version: 2.6.0
      branch: integration-PRODUCT_VERSION
  - name: uan-unset-nologin
    playbook: unset_nologin.yml
    product:
      name: uan
      version: 2.6.0
      branch: integration-PRODUCT_VERSION
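The branch names in the example above still contain the PRODUCT_VERSION placeholder, which must be replaced before the file is used. The following is a minimal sketch of doing that with sed rather than by hand. The helper name fill_product_version and the output file name are hypothetical, and the sketch assumes the version string contains no characters that are special to sed (such as a slash).

```shell
# fill_product_version is a hypothetical helper, not part of SAT.
# It writes a copy of an input file ($2) to $3 with every PRODUCT_VERSION
# placeholder replaced by the version string given in $1.
fill_product_version() {
    local version="$1" infile="$2" outfile="$3"
    sed "s/PRODUCT_VERSION/${version}/g" "$infile" > "$outfile"
}

# Example call, using the 2.6.0 version shown in the layers above:
# fill_product_version 2.6.0 uan-bootprep.yaml uan-bootprep.filled.yaml
```

Writing to a separate output file keeps the original template intact, so it can be reused for a later product version.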
Add the content for the UAN image, using an appropriate name to correctly identify the UAN image being built.
UAN images are built using the COS recipe, so this step specifies which image recipe to use based on what is provided by COS. The ims section references the uan-config CFS configuration so that CFS image customization will use that configuration along with the specified node groups.
The following example will create an IMS image with the name cray-shasta-uan-sles15sp3.x86_64-2.3.25.
images:
- name: cray-shasta-uan-sles15sp3.x86_64-2.3.25
  ims:
    is_recipe: true
    name: cray-shasta-compute-sles15sp3.x86_64-2.3.25
  configuration: uan-config
  configuration_group_names:
  - Application
  - Application_UAN
Add the content for the BOS session templates.
You may need to change the boot_sets key uan in the following example. If there is more than one boot set in the session template, each key under boot_sets must be unique.
session_templates:
- name: uan-2.4.0
  image: cray-shasta-uan-sles15sp3.x86_64-2.3.25
  configuration: uan-config
  bos_parameters:
    boot_sets:
      uan:
        kernel_parameters: spire_join_token=${SPIRE_JOIN_TOKEN}
        node_roles_groups:
        - Application_UAN
Save the sat bootprep input YAML file and exit the editor.
Execute the sat bootprep command to generate the configurations and artifacts needed to boot UANs.
This command may take some time as it will initiate IMS image creation and CFS image customization.
ncn-m001# sat bootprep run uan-bootprep.yaml
Modify the CFS layers or input file if necessary to successfully complete sat bootprep.
If any artifacts are going to be overwritten, SAT will prompt for confirmation before overwriting them. This is useful when making CFS changes as SAT will automatically configure the layers to use the latest git commits if the branches are specified correctly.
Save the input file to a known location after sat bootprep completes successfully.
You can use this input file to regenerate artifacts as changes are made or different product layers are added.