This page describes the procedure to customize a management node image using the Configuration Framework Service (CFS) and the Image Management Service (IMS). The procedure has options for configuring Kubernetes master or worker nodes or Ceph storage nodes.
When performing an upgrade of CSM and/or additional HPE Cray EX software products, this procedure is not followed end-to-end, but certain portions of the procedure are referenced. When performing an install or upgrade, be sure to follow the appropriate procedures in Cray System Management Install or Upgrade CSM. This procedure is referenced from operational procedures that occur after CSM install or upgrade.
Determine the ID of the management node boot image in the Image Management Service (IMS). There are several alternatives to find the management node image ID to use.
Use this option to find the IMS image ID of the image that is configured to boot on a management node the next time it boots. This is not necessarily the image that the management node is currently running. This procedure can be used whether the management node is booted or not.
This image is identified by examining the management node’s boot parameters in the Boot Script Service (BSS) as described in the procedure below.
(ncn-mw#) Set NODE_XNAME to the component name (xname) of the management node whose boot image is to be used.
If customizing a Kubernetes master node image, get the xname of ncn-m001:
NODE_XNAME=$(ssh ncn-m001 cat /etc/cray/xname)
echo "${NODE_XNAME}"
If customizing a Kubernetes worker node image, get the xname of ncn-w001:
NODE_XNAME=$(ssh ncn-w001 cat /etc/cray/xname)
echo "${NODE_XNAME}"
If customizing a Ceph storage node image, get the xname of ncn-s001:
NODE_XNAME=$(ssh ncn-s001 cat /etc/cray/xname)
echo "${NODE_XNAME}"
(ncn-mw#) Extract the S3 path prefix for the image that will be used on the next boot of the chosen management node. This prefix corresponds to the IMS image ID of the boot image.
IMS_IMAGE_ID=$(cray bss bootparameters list --name "${NODE_XNAME}" --format json | \
jq -r '.[0].params' | \
sed 's#\(^.*[[:space:]]\|^\)metal[.]server=[^[:space:]]*/boot-images/\([^[:space:]]\+\)/rootfs.*#\2#')
echo "${IMS_IMAGE_ID}"
The output should be a UUID string. For example, 8f41cc54-82f8-436c-905f-869f216ce487.
The command in this substep extracts the location of the NCN image from the metal.server boot parameter for the NCN in BSS. For more information on that parameter, see metal.server boot parameter.
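For illustration only, a boot parameter of the following general form would yield the example UUID above. The exact URL scheme and host in the metal.server value vary by CSM release, so treat this value as hypothetical:
metal.server=http://<s3-host>/boot-images/8f41cc54-82f8-436c-905f-869f216ce487/rootfs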
Use this option to find the IMS image ID of the image which is currently booted on a management node. In this case, the image is identified by examining the kernel command-line parameters on the booted node.
NOTE: Do not use this option if either of the following is true:
The management node whose image is to be used is not currently booted.
The image to be used is not the image that the management node is currently booted with.
(ncn-mw#) Set the BOOTED_NODE variable to the hostname of the management node that is booted using the image to be modified. For example, ncn-w001 or ncn-s002.
BOOTED_NODE=ncn-<msw###>
(ncn-mw#) Extract the S3 path prefix for the image used to boot the chosen management node. This prefix corresponds to the IMS image ID of the boot image.
IMS_IMAGE_ID=$(ssh "${BOOTED_NODE}" sed \
"'s#\(^.*[[:space:]]\|^\)metal[.]server=[^[:space:]]*/boot-images/\([^[:space:]]\+\)/rootfs.*#\2#'" \
/proc/cmdline)
echo "${IMS_IMAGE_ID}"
The output should be a UUID string. For example, 8f41cc54-82f8-436c-905f-869f216ce487.
The command in this substep extracts the location of the NCN image from the metal.server boot parameter in the /proc/cmdline file on the booted NCN. For more information on that parameter, see metal.server boot parameter.
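As an optional cross-check, using commands already shown on this page, compare the booted image ID with the image currently assigned to the node in BSS; if the two differ, the node has been assigned a different image since it last booted:
BOOTED_XNAME=$(ssh "${BOOTED_NODE}" cat /etc/cray/xname)
cray bss bootparameters list --name "${BOOTED_XNAME}" --format json | \
    jq -r '.[0].params' | grep -o 'metal[.]server=[^ ]*'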
Use this option to find the IMS image ID of a base image provided by the CSM product. The CSM product provides two
images for management nodes. One image is for Kubernetes master and worker nodes. The other image is for Ceph storage
nodes. Starting with CSM 1.4.0, the CSM product adds these image IDs to the cray-product-catalog
Kubernetes ConfigMap.
These images can be used as a base for configuration if the administrator does not need to preserve any additional customizations on the images which are currently booted or assigned to be booted on management nodes.
(ncn-mw#) Get the versions of CSM available in the cray-product-catalog ConfigMap:
kubectl get configmap -n services cray-product-catalog -o jsonpath={.data.csm} | yq -j r - | jq -r 'keys[]'
This will print a list of the CSM release versions installed on the system. For example:
1.3.0
1.4.0
(ncn-mw#) Set the desired CSM release version as an environment variable.
CSM_RELEASE=1.4.0
(ncn-mw#) Get the IMS image ID of the image for Kubernetes nodes or Ceph storage nodes.
To get the IMS image ID of the Kubernetes image, use the following command:
IMS_IMAGE_ID=$(kubectl get configmap -n services cray-product-catalog -o jsonpath={.data.csm} | yq -j r - |
jq -r '.["'${CSM_RELEASE}'"].images | to_entries |
map(select(.key | startswith("secure-kubernetes"))) | first | .value.id')
echo "${IMS_IMAGE_ID}"
To get the IMS image ID of the Ceph storage image, use the following command:
IMS_IMAGE_ID=$(kubectl get configmap -n services cray-product-catalog -o jsonpath={.data.csm} | yq -j r - |
jq -r '.["'${CSM_RELEASE}'"].images | to_entries |
map(select(.key | startswith("secure-storage-ceph"))) | first | .value.id')
echo "${IMS_IMAGE_ID}"
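Optionally, verify that the selected image ID exists in IMS before proceeding. The following command prints the IMS image record, which should include the image name and a link to its manifest:
cray ims images describe "${IMS_IMAGE_ID}"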
If the CFS configuration to be used for management node image customization has already been identified, skip this step and proceed to Run the CFS image customization session.
A CFS configuration that applies to management nodes and works for both image customization and post-boot personalization is created as part of the CSM install and upgrade procedures. Generally, this is the configuration that should be used to perform management node image customization. The first option below shows how to find the name of that CFS configuration. The second option describes how to manually create a new CFS configuration that contains just the minimum CSM layers.
Use this option to find the CFS configuration currently applied to the management nodes.
(ncn-mw#) Use sat status to get the desired CFS configuration of all the management nodes.
sat status --filter role=Management --fields xname,role,subrole,desiredconfig
The output will be a table showing xname, role, subrole, and desired configuration set in CFS for all the management nodes on the system. For example:
+----------------+------------+---------+-------------------+
| xname | Role | SubRole | Desired Config |
+----------------+------------+---------+-------------------+
| x3000c0s1b0n0 | Management | Master | management-23.4.0 |
| x3000c0s2b0n0 | Management | Master | management-23.4.0 |
| x3000c0s3b0n0 | Management | Master | management-23.4.0 |
| x3000c0s4b0n0 | Management | Worker | management-23.4.0 |
| x3000c0s5b0n0 | Management | Worker | management-23.4.0 |
| x3000c0s6b0n0 | Management | Worker | management-23.4.0 |
| x3000c0s7b0n0 | Management | Worker | management-23.4.0 |
| x3000c0s8b0n0 | Management | Worker | management-23.4.0 |
| x3000c0s9b0n0 | Management | Worker | management-23.4.0 |
| x3000c0s10b0n0 | Management | Storage | management-23.4.0 |
| x3000c0s11b0n0 | Management | Storage | management-23.4.0 |
| x3000c0s12b0n0 | Management | Storage | management-23.4.0 |
+----------------+------------+---------+-------------------+
The value of the Desired Config column is the name of the CFS configuration currently applied to the nodes. There will typically be only one CFS configuration applied to all management nodes.
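If sat is not available, the desired configuration of a single management node can also be read directly from CFS. This is a sketch assuming the CFS v2 field name desiredConfig; CFS v3 reports the same value as desired_config:
cray cfs components describe <xname> --format json | jq -r '.desiredConfig'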
Ensure that the selected desired configuration includes the ncn-initrd.yml playbook in its layers by using the following command:
cray cfs configurations describe "<config name>"
lastUpdated = "<yyyy-mm-ddThh:mm:ssZ>"
name = "<config name>"
[[layers]]
...
[[layers]]
cloneUrl = "https://<repo.git>"
commit = "<hash value>"
name = "<name>"
playbook = "ncn-initrd.yml"
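As a quicker alternative check, list only the playbooks from the JSON form of the output (a sketch assuming jq is available); the list should include ncn-initrd.yml:
cray cfs configurations describe "<config name>" --format json | jq -r '.layers[].playbook'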
If the desired configuration does not include ncn-initrd.yml in its layers, add it by updating the configuration as follows:
cat <filename>.json
{
"layers": [
{
"cloneUrl": "https://<repo.git>",
"commit": "<hash value>",
"name": "<name>",
"playbook": "site.yml"
},
{
"cloneUrl": "https://<repo.git>",
"commit": "<hash value>",
"name": "<name>",
"playbook": "ncn-initrd.yml"
}
]
}
cray cfs configurations update <configname> --file <filename>.json
lastUpdated = "<yyyy-mm-ddThh:mm:ssZ>"
name = "<configname>"
[[layers]]
cloneUrl = "https://<repo.git>"
commit = "<hash value>"
name = "<name>"
playbook = "site.yml"
[[layers]]
cloneUrl = "https://<repo.git>"
commit = "<hash value>"
name = "<name>"
playbook = "ncn-initrd.yml"
cray cfs configurations describe <configname> --format json
{
"lastUpdated": "<yyyy-mm-ddThh:mm:ssZ>",
"layers": [
{
"cloneUrl": "https://<repo.git>",
"commit": "<hash value>",
"name": "<name>",
"playbook": "site.yml"
},
{
"cloneUrl": "https://<repo.git>",
"commit": "<hash value>",
"name": "<name>",
"playbook": "ncn-initrd.yml"
}
],
"name": "<configname>"
}
Use this option to create a new CFS configuration for management nodes.
The following procedure describes how to create a CFS configuration that contains the minimal layers provided by CSM.
(ncn-mw#) Clone the csm-config-management repository.
VCS_USER=$(kubectl get secret -n services vcs-user-credentials --template={{.data.vcs_username}} | base64 --decode)
VCS_PASS=$(kubectl get secret -n services vcs-user-credentials --template={{.data.vcs_password}} | base64 --decode)
git clone "https://${VCS_USER}:${VCS_PASS}@api-gw-service-nmn.local/vcs/cray/csm-config-management.git"
A Git commit hash from this repository is needed in the following step. After cloning the repository, verify that the YAML playbooks used in image customization are present. If they are not, check the other branches of the cloned repository and switch to the branch that contains the required playbooks.
git branch -a
cray/csm/<version>
* main
...
git checkout cray/csm/<version>
Switched to branch 'cray/csm/<version>'
Your branch is up to date with 'origin/cray/csm/<version>'.
...
ls
initrd.yml ...
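To capture the commit hash of the currently checked-out branch for use when defining the configuration layers:
git rev-parse HEAD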
(ncn-mw#) Create the CFS configuration to use for image customization.
For more information on creating CFS configurations, see CFS Configurations.
The first layer in the CFS configuration should be similar to this:
{
"name": "csm-ncn-workers",
"clone_url": "https://api-gw-service-nmn.local/vcs/cray/csm-config-management.git",
"playbook": "ncn-worker_nodes.yml",
"commit": "<git commit hash>"
}
The last layer in the CFS configuration should be similar to this:
{
"name": "csm-ncn-initrd",
"clone_url": "https://api-gw-service-nmn.local/vcs/cray/csm-config-management.git",
"playbook": "ncn-initrd.yml",
"commit": "<git commit hash>"
}
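As an illustration, a minimal configuration file combining these two layers, and a command to create the configuration from it, might look like the following. This sketch assumes the CFS v3 API (which uses the clone_url field name) and follows the worker-node first layer shown above; the configuration name csm-ncn-image-customization, the file name, and the placeholder commit hashes are examples only:
cat ncn-image-customization.json
{
  "layers": [
    {
      "name": "csm-ncn-workers",
      "clone_url": "https://api-gw-service-nmn.local/vcs/cray/csm-config-management.git",
      "playbook": "ncn-worker_nodes.yml",
      "commit": "<git commit hash>"
    },
    {
      "name": "csm-ncn-initrd",
      "clone_url": "https://api-gw-service-nmn.local/vcs/cray/csm-config-management.git",
      "playbook": "ncn-initrd.yml",
      "commit": "<git commit hash>"
    }
  ]
}
cray cfs v3 configurations update csm-ncn-image-customization --file ncn-image-customization.json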
(ncn-mw#) Record the name of the CFS configuration to use for image customization. Set the CFS_CONFIG_NAME variable to the name of the CFS configuration identified or created in the previous section, 2. Find or create a CFS configuration.
CFS_CONFIG_NAME=management-23.4.0
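Optionally, confirm that the named configuration exists in CFS:
cray cfs configurations describe "${CFS_CONFIG_NAME}"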
(ncn-mw#) Ensure that the environment variable IMS_IMAGE_ID is set. This variable should have been set in the earlier step, 1. Determine management node image ID.
echo "${IMS_IMAGE_ID}"
(ncn-mw#) Create an image customization CFS session. See Create an Image Customization CFS Session for additional information.
Set a name for the session.
CFS_SESSION_NAME="management-image-customization-$(date +%y%m%d%H%M%S)"
echo "${CFS_SESSION_NAME}"
Set the TARGET_GROUP environment variable based on the type of management node for which the image is being customized (master, worker, or storage).
For worker nodes:
TARGET_GROUP=Management_Worker
For master nodes:
TARGET_GROUP=Management_Master
For storage nodes:
TARGET_GROUP=Management_Storage
Create the session.
cray cfs v3 sessions create \
--name "${CFS_SESSION_NAME}" \
--configuration-name "${CFS_CONFIG_NAME}" \
--target-definition image --format json \
--target-group "${TARGET_GROUP}" "${IMS_IMAGE_ID}"
Monitor the CFS session until it completes successfully.
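One way to monitor the session is to poll its status until it reports complete and then check that it succeeded. This is a sketch assuming the session status fields are nested under .status.session, alongside the artifacts used in the next step:
cray cfs v3 sessions describe "${CFS_SESSION_NAME}" --format json | jq -r '.status.session.status, .status.session.succeeded'
The session has finished successfully when the first value is complete and the second is true.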
Obtain the IMS resultant image ID.
IMS_RESULTANT_IMAGE_ID=$(cray cfs v3 sessions describe "${CFS_SESSION_NAME}" --format json |
jq -r '.status.artifacts[] | select(.type == "ims_customized_image") | .result_id')
echo "${IMS_RESULTANT_IMAGE_ID}"
Example output:
a44ff301-6232-46b4-9ba6-88ee1d19e2c2
This step describes how to update each management node to boot from a new customized image. This is accomplished by updating the management nodes’ boot parameters in BSS to use the artifacts from the new customized image.
(ncn-mw#) Set an environment variable for the IMS image ID. If executing this step in the context of the entire procedure on this page, then use the IMS_RESULTANT_IMAGE_ID value set in the previous step, 3. Run the CFS image customization session.
NEW_IMS_IMAGE_ID="${IMS_RESULTANT_IMAGE_ID}"
Update the boot parameters of the management nodes using the assign-ncn-images.sh script. It can be used to update the boot parameters for all management nodes in particular categories (master, worker, or storage), or for specific management nodes.
Update the boot parameters for all master and worker nodes
/usr/share/doc/csm/scripts/operations/node_management/assign-ncn-images.sh \
-mw -p "${NEW_IMS_IMAGE_ID}"
Update the boot parameters for all master nodes
/usr/share/doc/csm/scripts/operations/node_management/assign-ncn-images.sh \
-m -p "${NEW_IMS_IMAGE_ID}"
Update the boot parameters for all worker nodes
/usr/share/doc/csm/scripts/operations/node_management/assign-ncn-images.sh \
-w -p "${NEW_IMS_IMAGE_ID}"
Update the boot parameters for all storage nodes
/usr/share/doc/csm/scripts/operations/node_management/assign-ncn-images.sh \
-s -p "${NEW_IMS_IMAGE_ID}"
Update the boot parameters for specific management nodes
Determine the component names (xnames) of the management nodes whose boot parameters are to be updated.
One option to determine the xname of a node is to read the /etc/cray/xname file on it:
ssh ncn-w001 cat /etc/cray/xname
Run the script to update their boot parameters.
In the following example, be sure to replace the <xname> placeholder strings with the actual xnames of the management nodes whose boot parameters are to be updated.
/usr/share/doc/csm/scripts/operations/node_management/assign-ncn-images.sh \
-p "${NEW_IMS_IMAGE_ID}" <xname1> <xname2> ...
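After the script runs, the updated assignments can be spot-checked in BSS (a sketch; the kernel and initrd paths and the metal.server boot parameter should all reference the new image ID):
cray bss bootparameters list --name <xname> --format json | jq -r '.[0] | .kernel, .initrd, .params'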
If this procedure is being performed as an operational task outside the context of an install or upgrade, then proceed to Rebuild NCNs when ready to reboot the management nodes to their new images.