This section ensures the product content is loaded onto the system and available for later steps in the workflow.
- If this IUF procedure is not part of an upgrade from CSM 1.6 to CSM 1.7, then this section should be skipped.
- Rack Resiliency is experimental.
Rack Resiliency is new in CSM 1.7. It is disabled by default, and this setting cannot be changed later. Administrators are advised to take time to determine whether or not they wish to use this feature. See Rack Resiliency for details.
If an administrator does not wish to enable the Rack Resiliency feature, then the rest of this section can be skipped. Otherwise, follow these steps to enable (and optionally customize) Rack Resiliency.
(ncn-mw#) Retrieve the customizations.yaml file.
TMPDIR=$(mktemp -d -p ~) &&
kubectl get secrets -n loftsman site-init -o jsonpath='{.data.customizations\.yaml}' \
| base64 -d > "${TMPDIR}/customizations.yaml" \
&& echo "${TMPDIR}/customizations.yaml"
Example output:
/root/tmp.iM4FrDrJEJ/customizations.yaml
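The retrieval command above is a pipeline, so its exit status comes from base64 rather than from kubectl. If kubectl fails, the command can still print the path of an empty file. A minimal sketch of a sanity check follows; the function name check_customizations is hypothetical and not part of CSM.

```shell
# Hypothetical helper (not part of CSM): succeed only if the given file
# exists and is non-empty.
check_customizations() {
    if [ -s "$1" ]; then
        echo "OK: $1"
    else
        echo "ERROR: $1 is missing or empty" >&2
        return 1
    fi
}
```

It could be called as check_customizations "${TMPDIR}/customizations.yaml" before editing the file.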
(ncn-mw#) Enable the feature in customizations.yaml.
yq write -i "${TMPDIR}/customizations.yaml" \
'spec.kubernetes.services.rack-resiliency.enabled' "true"
(ncn-mw#) Optionally, set custom zone name prefixes.
See Zone names for details on reasons for doing this and restrictions on names. This is optional; prefixes are not required. However, prefixes cannot be changed, set, or removed later.
Optionally, set a site-specific Kubernetes zone prefix.
In the following command, replace k8s-prefix-string with the desired Kubernetes zone prefix.
yq write -i "${TMPDIR}/customizations.yaml" \
'spec.kubernetes.services.rack-resiliency.k8s_zone_prefix' "k8s-prefix-string"
Optionally, set a site-specific Ceph zone prefix.
In the following command, replace ceph-prefix-string with the desired Ceph zone prefix.
yq write -i "${TMPDIR}/customizations.yaml" \
'spec.kubernetes.services.rack-resiliency.ceph_zone_prefix' "ceph-prefix-string"
(ncn-mw#) Update the site-init secret in the Kubernetes cluster.
kubectl delete secret -n loftsman site-init \
&& kubectl create secret -n loftsman generic site-init \
--from-file="${TMPDIR}/customizations.yaml"
Expected output:
secret/site-init created
(ncn-mw#) Confirm that the fields are set to the desired values.
kubectl get secrets -n loftsman site-init \
-o jsonpath='{.data.customizations\.yaml}' \
| base64 -d | yq r - 'spec.kubernetes.services.rack-resiliency'
Example output (in a case where only the Ceph zone prefix was set):
enabled: true
ceph_zone_prefix: my-ceph-prefix
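The confirmation above can also be checked mechanically rather than by eye. The following is a minimal sketch, assuming the yq output has the same form as the example output; the function name rr_is_enabled is hypothetical.

```shell
# Hypothetical helper (not part of CSM): reads the rack-resiliency YAML
# fragment on standard input and succeeds only if it contains a line
# of the form "enabled: true".
rr_is_enabled() {
    grep -Eq '^enabled:[[:space:]]*true[[:space:]]*$'
}
```

Piping the output of the confirmation command into rr_is_enabled gives a zero exit status only when the feature is enabled in the stored secret.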
process-media and pre-install-check stages
The “Install and Upgrade Framework” section of each individual product’s installation document may contain special actions that need to be performed outside of IUF for a stage. The “IUF Stage Documentation Per Product”
section of the HPE Cray EX System Software Stack Installation and Upgrade Guide for CSM (S-8052) provides a table that summarizes which product documents contain information or actions for the process-media or pre-install-check stages.
Refer to that table and any corresponding product documents before continuing to the next step.
Invoke iuf run with activity identifier ${ACTIVITY_NAME} and use -e to execute the process-media and pre-install-check stages. Perform the upgrade
using product content found in ${MEDIA_DIR}.
(ncn-m001#) Execute the process-media and pre-install-check stages.
iuf -a ${ACTIVITY_NAME} -m "${MEDIA_DIR}" run -e pre-install-check
IMPORTANT: If upgrading CSM manually, ensure that the docs-csm-latest.noarch.rpm and libcsm-latest.noarch.rpm RPMs are available at path /root/<rpm> before executing the above command.
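For a manual CSM upgrade, the presence of the two RPMs named in the note can be checked before running the stages. This is a minimal sketch, assuming the RPMs sit directly in /root under the file names given in the note; the function name check_csm_rpms is hypothetical.

```shell
# Hypothetical helper (not part of CSM): verify that both RPMs named in
# the note are present in the given directory (default: /root).
check_csm_rpms() {
    dir="${1:-/root}"
    for rpm in docs-csm-latest.noarch.rpm libcsm-latest.noarch.rpm; do
        if [ ! -f "${dir}/${rpm}" ]; then
            echo "Missing: ${dir}/${rpm}" >&2
            return 1
        fi
    done
    echo "Both RPMs found in ${dir}"
}
```

Running check_csm_rpms with no argument checks /root; a non-zero exit status means at least one RPM is missing.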
Once this step has completed:
- The process-media and pre-install-check stages have been executed for the product content found in ${MEDIA_DIR}.
customizations.yaml
NOTE This subsection is optional and can be skipped if upgrading only CSM through IUF.
Some products require modifications to the customizations.yaml file before executing the deliver-product stage. Currently, this is limited to the Slurm and PBS Workload Manager (WLM) products, CSM Diags, and the UAN product. Refer to the
“Install and Upgrade Framework” section of the Slurm, PBS, CSM Diags, and UAN product documents to determine the actions that need to be performed to update customizations.yaml.
Once this step has completed:
- The customizations.yaml file has been updated per product documentation.
To create site_vars.yaml in the admin directory, refer to Populate admin directory with files defining site preferences.
deliver-product stage
The “Install and Upgrade Framework” section of each individual product’s installation document may contain special actions that need to be performed outside of IUF for a stage. The “IUF Stage Documentation Per Product”
section of the HPE Cray EX System Software Stack Installation and Upgrade Guide for CSM (S-8052) provides a table that summarizes which product documents contain information or actions for the deliver-product stage.
Refer to that table and any corresponding product documents before continuing to the next step.
Invoke iuf run with activity identifier ${ACTIVITY_NAME} and use -r to execute the deliver-product stage. Perform the upgrade using product content found in ${MEDIA_DIR}.
Additional arguments are available to control the behavior of the deliver-product stage (for example, -rv). See the deliver-product stage documentation
for details and adjust the example below if necessary.
NOTE When installing USS 1.1 or higher, select either Slurm or PBS Pro Products to use on the system before running this stage. This should be specified in site_vars.yaml.
For more information, see the deliver-product stage details in the “Install and Upgrade Framework” section of the HPE Cray Supercomputing User Services Software Administration Guide: CSM on HPE Cray Supercomputing EX Systems (S-8063).
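The deliver-product command in the next step depends on ${ACTIVITY_NAME}, ${MEDIA_DIR}, and the site_vars.yaml and product_vars.yaml files in ${ADMIN_DIR}. A pre-flight check along the following lines may catch a missing input before IUF starts. It is a sketch only; the function name iuf_preflight is hypothetical and not part of IUF.

```shell
# Hypothetical helper (not part of IUF): check the inputs used by the
# deliver-product command.
#   $1 = activity name, $2 = media directory, $3 = admin directory
iuf_preflight() {
    if [ -z "$1" ]; then
        echo "Activity name is empty" >&2
        return 1
    fi
    if [ ! -d "$2" ]; then
        echo "Media directory not found: $2" >&2
        return 1
    fi
    for f in site_vars.yaml product_vars.yaml; do
        if [ ! -f "$3/$f" ]; then
            echo "Missing file: $3/$f" >&2
            return 1
        fi
    done
    echo "Pre-flight check passed"
}
```

It could be called as iuf_preflight "${ACTIVITY_NAME}" "${MEDIA_DIR}" "${ADMIN_DIR}" immediately before the iuf command.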
(ncn-m001#) Execute the deliver-product stage. Use site variables from the site_vars.yaml file found in ${ADMIN_DIR} and recipe variables from the product_vars.yaml file found in ${ADMIN_DIR}.
iuf -a ${ACTIVITY_NAME} -m "${MEDIA_DIR}" run --site-vars \
"${ADMIN_DIR}/site_vars.yaml" -bpcd "${ADMIN_DIR}" -r deliver-product
Run upload-rebuild-templates.sh to ensure the correct CSM product versions will be used by IUF now that all product artifacts have been uploaded.
(ncn-m001#) Execute the upload-rebuild-templates.sh script.
/usr/share/doc/csm/workflows/scripts/upload-rebuild-templates.sh
Once this step has completed:
- The product content found in ${MEDIA_DIR} has been uploaded to the system.
- The deliver-product stage has been executed.
NOTE This subsection is optional and can be skipped if upgrading only CSM through IUF.
NOTE This subsection is optional and can be skipped if third-party GPU and/or programming environment software is not needed.
Some products provide instructions for delivering third-party content to the system outside of IUF. If this content is desired, refer to the following documentation for instructions and execute the procedures before continuing with the workflow.
- Documentation for the gpu-nexus-tool script, which uploads third-party GPU software to Nexus. The GPU software is used later in the workflow when creating CFS configurations and building compute and application node images.
- Documentation for the install-3p.sh and cpe-custom-img.sh scripts, which upload third-party programming environment software to Nexus and build images. The programming environment software is used later in the workflow when creating CPE configurations.
Once this step has completed:
If performing an initial install or an upgrade of non-CSM products only, return to the Install or upgrade additional products with IUF workflow to continue the install or upgrade.
If performing an upgrade that includes upgrading CSM and additional products with IUF, return to the Upgrade CSM and additional products with IUF workflow to continue the upgrade.
If performing an upgrade that includes upgrading only CSM, return to the Upgrade only CSM through IUF workflow to continue the upgrade.