This section ensures product configuration has been defined, customized, and is available for later steps in the workflow.
This workflow uses ${ADMIN_DIR}
to retain files that define site preferences for IUF. ${ADMIN_DIR}
is defined separately from ${ACTIVITY_DIR}
and ${MEDIA_DIR}
based on the assumption that the files in ${ADMIN_DIR}
will be
used when performing future IUF operations unrelated to this workflow.
NOTE
The following steps assume ${ADMIN_DIR}
is empty. If this is not the case, i.e. ${ADMIN_DIR}
has been populated by previous IUF workflows, ensure the content in ${ADMIN_DIR}
is up to date with the latest content
provided by the HPC CSM Software Recipe release content being installed. This may involve merging new content provided in the latest branch of the hpc-csm-software-recipe
repository in VCS or provided in the files extracted from
the HPC CSM Software Recipe with the existing content in ${ADMIN_DIR}
.
update-vcs-config
stage
Change directory to ${ADMIN_DIR}
(ncn-m001#
) Change directory
cd ${ADMIN_DIR}
Copy the sat bootprep
and product_vars.yaml
files from the uncompressed HPC CSM Software Recipe distribution file in the media directory to the current directory.
(ncn-m001#
) Copy sat bootprep
and product_vars.yaml
files
cp "${MEDIA_DIR}"/hpc-csm-software-recipe-*/vcs/product_vars.yaml .
cp -r "${MEDIA_DIR}"/hpc-csm-software-recipe-*/vcs/bootprep .
(ncn-m001#
) Examine the contents of ${ADMIN_DIR}
to verify the expected content is present
find . -type f
Example output:
./bootprep/management-bootprep.yaml
./bootprep/compute-and-uan-bootprep.yaml
./product_vars.yaml
Edit the compute-and-uan-bootprep.yaml
and management-bootprep.yaml
files to account for any site deviations from the default values. For example:
slurm-site
CFS configuration layer and uncomment the pbs-site
CFS configuration layer in compute-and-uan-bootprep.yaml
if PBS is the preferred workload managergpu-{{recipe.version}}
CFS configuration layer and gpu-image
image definition in compute-and-uan-bootprep.yaml
if the system has GPU hardwarecompute-and-uan-bootprep.yaml
and management-bootprep.yaml
files for products that are not needed on the systemCreate a site_vars.yaml
file in ${ADMIN_DIR}
. This file will contain key/value pairs for any configuration changes that should override entries in the default
section of the HPE-provided product_vars.yaml
file.
There are comments at the top of the product_vars.yaml
file that describe the variables and related details. The following are a few examples of site_vars.yaml
changes:
default
section containing a network_type: "cassini"
entry to designate that Cassini is the desired Slingshot network type to be used when executing CFS configurations later in the workflowsuffix
entry to the default
section to append a string to the names of CFS configuration, image, and BOS session template artifacts created during the workflow to make them easy to identifyAdditional information on site_vars.yaml
files can be found in the Site and recipe variables and update-vcs-config
sections.
<create a site_vars.yaml
file with desired key/value pairs >
Ensure the site_vars.yaml
file contents are formatted correctly. The following text is an example for verification purposes only.
(ncn-m001#
) Display the contents of an example site_vars.yaml
file
cat site_vars.yaml
Example output:
default:
network_type: "cassini"
suffix: "-test01"
Ensure the expected files are present in the admin directory after performing the steps in this section.
(ncn-m001#
) Examine the contents of ${ADMIN_DIR}
to verify the expected content is present
find . -type f
Example output:
./bootprep/management-bootprep.yaml
./bootprep/compute-and-uan-bootprep.yaml
./product_vars.yaml
./site_vars.yaml
Once this step has completed:
${ADMIN_DIR}
is populated with product_vars.yaml
, site_vars.yaml
, and sat bootprep
input filesupdate-vcs-config
stageFor each product that uploaded Ansible configuration content to a configuration management VCS repository, the update-vcs-config
stage attempts to merge the pristine branch of the configuration management repository into a
corresponding customer working branch.
Understand the default branching scheme defined in product_vars.yaml
, which is typically integration-<x.y.z>
. Details are provided in the update-vcs-config stage documentation.
Create and configure site_vars.yaml
to properly define the customer branching strategy as well as any needed product-specific overrides and provide it as an argument when invoking iuf run
.
If the default branching scheme described above does not match the customer branching scheme used, use git
to perform a manual migration of VCS content to the default branching scheme before running the update-vcs-config
stage.
For example, if the customer branch integration
was previously used with the slingshot-host-software-2.0.0
release and slingshot-host-software-2.0.2
is now being installed with the default integration-<x.y.z>
branching
scheme, create the branch that IUF expects from the current integration
branch in the slingshot-host-software-config-management
repository:
(ncn-m001#
) Create an integration-2.0.0
branch from integration
to align with IUF expectations
ncn-m001:/mnt/admin/cfg/slingshot-host-software-config-management# git checkout integration
ncn-m001:/mnt/admin/cfg/slingshot-host-software-config-management# git pull
ncn-m001:/mnt/admin/cfg/slingshot-host-software-config-management# git branch integration-2.0.0
ncn-m001:/mnt/admin/cfg/slingshot-host-software-config-management# git checkout integration-2.0.0
ncn-m001:/mnt/admin/cfg/slingshot-host-software-config-management# git push
When the update-vcs-config
stage is run, IUF will now use the integration-2.0.0
branch as the starting point for merging because it adheres to the expected branching scheme.
If there are workarounds checked into VCS that modify the HPE-provided Ansible plays or roles and the workarounds are no longer needed in the new version of software being upgraded to, it is beneficial to revert the workarounds
prior to running the update-vcs-config
stage to avoid merge conflicts.
For example, if the following workaround was present in the Slurm integration-1.2.9
branch:
ncn-m001:/mnt/admin/cfg/slurm-config-management# git checkout integration-1.2.9
ncn-m001:/mnt/admin/cfg/slurm-config-management# git log
commit 133d5fc815aafd502d3aca07961524e4f9eab445 (origin/integration-1.2.9, integration-1.2.9)
Author: Joe Smith <joe.smith@example.com>
Date: Fri Mar 3 19:26:53 2023 +0000
Workaround for bug #234232
… and an upgrade to Slurm integration-1.2.10
is being performed, then a new working branch should be created and the workaround should be reverted from the new branch:
ncn-m001:/mnt/admin/cfg/slurm-config-management# git branch integration-1.2.10
ncn-m001:/mnt/admin/cfg/slurm-config-management# git checkout integration-1.2.10
ncn-m001:/mnt/admin/cfg/slurm-config-management# git revert 133d5fc815aafd502d3aca07961524e4f9eab445
ncn-m001:/mnt/admin/cfg/slurm-config-management# git push
When the update-vcs-config
stage is run, IUF will now use the integration-1.2.10
branch, and the merge conflict that would have occurred will be avoided as the workaround was reverted.
NOTE
Additional arguments are available to control the behavior of the update-vcs-config
stage, for example -rv
. See the update-vcs-config
stage
documentation for details and adjust the examples below if necessary.
The “Install and Upgrade Framework” section of each individual product’s installation document may contain special actions that need to be performed outside of IUF for a stage. The “IUF Stage Documentation Per Product”
section of the HPE Cray EX System Software Stack Installation and Upgrade Guide for CSM (S-8052) provides a table that summarizes which product documents contain information or actions for the update-vcs-config
stage.
Refer to that table and any corresponding product documents before continuing to the next step.
Invoke iuf run
with -r
to execute the update-vcs-config
stage. Use site variables from the site_vars.yaml
file found in ${ADMIN_DIR}
and recipe variables from the product_vars.yaml
file found in ${ADMIN_DIR}
.
(ncn-m001#
) Execute the update-vcs-config
stage.
iuf -a ${ACTIVITY_NAME} run --site-vars "${ADMIN_DIR}/site_vars.yaml" -bpcd "${ADMIN_DIR}" -r update-vcs-config
Once this step has completed:
update-vcs-config
stageSome products must be manually configured prior to the creation of CFS configurations and images. The “Install and Upgrade Framework” section of each individual product’s installation documentation contains instructions for product-specific configuration, if any. Major changes may also be documented in the HPE Cray Supercomputing User Services Software Administration Guide: CSM on HPE Cray EX Systems. The following highlights some of the areas that require manual configuration changes but is not intended to be a comprehensive list. Note that many of the configuration changes are only required for initial installation scenarios.
group_vars
(done for each product release)sdu setup
sat auth
sat setrev
Once this step has completed:
If performing an initial install or an upgrade of non-CSM products only, return to the Install or upgrade additional products with IUF workflow to continue the install or upgrade.
If performing an upgrade that includes upgrading CSM, return to the Upgrade CSM and additional products with IUF workflow to continue the upgrade.