NOTE: Some of the documentation linked from this page mentions use of the Boot Orchestration Service (BOS). The use of BOS is only relevant for booting compute nodes and can be ignored when working with NCN images.
This document describes the configuration of a Kubernetes NCN image. The same steps could be used to modify a Ceph image.
Identify the NCN image to be modified.
This example assumes that the administrator wants to modify the Kubernetes image that is currently in use by Kubernetes NCNs. However, the steps are the same for any Management NCN SquashFS image.
If the image to be modified is the image currently booted on an NCN, the value for ARTIFACT_VERSION can be found by looking at the boot parameters for the NCNs, or from /proc/cmdline on a booted NCN. The version has the form X.Y.Z.
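As a rough illustration of reading the version from a kernel command line, the sketch below prints the last path component of the metal.server parameter. It assumes that metal.server ends in the image version, which matches the layout this procedure writes later (.../ncn-images/k8s/<version>). The function name and the sample command line are hypothetical, and the result should be checked against the BSS boot parameters before it is used.

```shell
# Hypothetical helper: print the last path component of the metal.server
# parameter found in the kernel command line string passed as $1.
# Assumption: metal.server ends in the image version, for example
# http://rgw-vip.nmn/ncn-images/k8s/<version>.
artifact_version_from_cmdline() {
    printf '%s\n' "$1" \
        | tr ' ' '\n' \
        | sed -n 's|^metal\.server=.*/\([^/]*\)$|\1|p' \
        | head -n 1
}

# Made-up sample command line, for illustration only.
sample_cmdline='kernel initrd=initrd metal.server=http://rgw-vip.nmn/ncn-images/k8s/1.2.3 metal.no-wipe=1'
artifact_version_from_cmdline "${sample_cmdline}"   # prints 1.2.3
```

On a booted NCN, the same function could be called as artifact_version_from_cmdline "$(cat /proc/cmdline)". It prints nothing if no metal.server parameter is present.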
Obtain the NCN image’s associated artifacts (SquashFS, kernel, and initrd).
These example commands show how to download these artifacts from S3, which is where the NCN image artifacts are stored.
ncn-mw# ARTIFACT_VERSION=<artifact-version>
ncn-mw# cray artifacts get ncn-images k8s/"${ARTIFACT_VERSION}"/filesystem.squashfs ./"${ARTIFACT_VERSION}"-filesystem.squashfs
ncn-mw# cray artifacts get ncn-images k8s/"${ARTIFACT_VERSION}"/kernel ./"${ARTIFACT_VERSION}"-kernel
ncn-mw# cray artifacts get ncn-images k8s/"${ARTIFACT_VERSION}"/initrd ./"${ARTIFACT_VERSION}"-initrd
ncn-mw# export IMS_ROOTFS_FILENAME="${ARTIFACT_VERSION}"-filesystem.squashfs
ncn-mw# export IMS_KERNEL_FILENAME="${ARTIFACT_VERSION}"-kernel
ncn-mw# export IMS_INITRD_FILENAME="${ARTIFACT_VERSION}"-initrd
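Before importing into IMS, it may be worth confirming that all three downloads produced non-empty files. The following is a minimal sketch of such a check; check_artifacts is a hypothetical helper name, not part of any CSM tooling.

```shell
# Hypothetical helper: succeed only if every named file exists and is
# non-empty. Problems are reported on standard error.
check_artifacts() {
    missing=0
    for artifact in "$@"; do
        if [ ! -s "${artifact}" ]; then
            echo "missing or empty: ${artifact}" >&2
            missing=1
        fi
    done
    return "${missing}"
}
```

With the variables exported above, it could be called as check_artifacts "${IMS_ROOTFS_FILENAME}" "${IMS_KERNEL_FILENAME}" "${IMS_INITRD_FILENAME}".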
Import the NCN image into IMS.
Perform the Import External Image to IMS procedure, except skip the following sections:
Clone the csm-config-management repository.
ncn-mw# VCS_USER=$(kubectl get secret -n services vcs-user-credentials --template={{.data.vcs_username}} | base64 --decode)
ncn-mw# VCS_PASS=$(kubectl get secret -n services vcs-user-credentials --template={{.data.vcs_password}} | base64 --decode)
ncn-mw# git clone "https://${VCS_USER}:${VCS_PASS}@api-gw-service-nmn.local/vcs/cray/csm-config-management.git"
A Git commit hash from this repository is needed in the following step.
NOTE: A new initrd must be generated at the end of the CFS session. Create a new Ansible playbook named ncn-initrd.yml at the root of the csm-config-management repository. The new playbook should have the following contents:
- hosts: Management_Worker
  any_errors_fatal: true
  remote_user: root
  tasks:
    - name: Create NCN initrd
      command: /srv/cray/scripts/common/create-ims-initrd.sh
      when: cray_cfs_image|default(false)|bool
This playbook should be the final layer in the CFS configuration created in the upcoming steps.
NOTE: If desired, the default SSH host keys can be removed with a CFS Ansible task:
- name: Remove SSH host keys
  # shell is used instead of command so that the *key* glob is expanded
  shell: rm -f /etc/ssh/*key*
  args:
    warn: false
New host keys will be generated by sshd when the NCN boots for the first time.
Create an Image Customization CFS Session.
In this section, use the following values for the target definition and target group in the cray cfs sessions create command invocation:
--target-definition image
--target-group Management_Worker
Download the resultant NCN artifacts.
NOTE: $IMS_RESULTANT_IMAGE_ID in the following commands is the result_id returned in the output of the last command in the previous step, which is repeated here for reference:
ncn-mw# cray cfs sessions describe example --format json | jq .status.artifacts
Download the resultant NCN artifacts:
ncn-mw# cray artifacts get boot-images "${IMS_RESULTANT_IMAGE_ID}/rootfs" "kubernetes-${ARTIFACT_VERSION}-1.squashfs"
ncn-mw# cray artifacts get boot-images "${IMS_RESULTANT_IMAGE_ID}/initrd" "initrd.img-${ARTIFACT_VERSION}-1.xz"
ncn-mw# cray artifacts get boot-images "${IMS_RESULTANT_IMAGE_ID}/kernel" "${ARTIFACT_VERSION}-1.kernel"
Upload NCN boot artifacts into S3.
This step assumes that the docs-csm RPM is installed. See Check for latest documentation.
ncn-mw# /usr/share/doc/csm/scripts/ceph-upload-file-public-read.py --bucket-name ncn-images --key-name "k8s/${ARTIFACT_VERSION}-1/filesystem.squashfs" --file-name "kubernetes-${ARTIFACT_VERSION}-1.squashfs"
ncn-mw# /usr/share/doc/csm/scripts/ceph-upload-file-public-read.py --bucket-name ncn-images --key-name "k8s/${ARTIFACT_VERSION}-1/initrd" --file-name "initrd.img-${ARTIFACT_VERSION}-1.xz"
ncn-mw# /usr/share/doc/csm/scripts/ceph-upload-file-public-read.py --bucket-name ncn-images --key-name "k8s/${ARTIFACT_VERSION}-1/kernel" --file-name "${ARTIFACT_VERSION}-1.kernel"
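Because the local file names and the S3 key names differ, a dry-run listing can help catch a mismatch before anything is uploaded. The sketch below only prints the pairs used in this procedure for a given version; list_upload_pairs is a hypothetical helper and does not contact S3.

```shell
# Hypothetical helper: print "<local file> -> <S3 key>" for each artifact,
# using the file and key names from this procedure. $1 is ARTIFACT_VERSION.
# Nothing is uploaded; this is only a dry-run listing.
list_upload_pairs() {
    version="$1"
    prefix="k8s/${version}-1"
    # printf repeats its format string for each pair of arguments.
    printf '%s -> %s\n' \
        "kubernetes-${version}-1.squashfs" "${prefix}/filesystem.squashfs" \
        "initrd.img-${version}-1.xz"       "${prefix}/initrd" \
        "${version}-1.kernel"              "${prefix}/kernel"
}
```

Calling list_upload_pairs "${ARTIFACT_VERSION}" prints three lines, one per artifact, in the same order as the upload commands.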
Update NCN boot parameters.
Get the existing metal.server setting for the component name (xname) of the node of interest.
ncn-mw# XNAME=<node-xname>
ncn-mw# METAL_SERVER=$(cray bss bootparameters list --hosts "${XNAME}" --format json | jq '.[] |."params"' \
| awk -F 'metal.server=' '{print $2}' \
| awk -F ' ' '{print $1}')
Verify that the variable was set correctly.
ncn-mw# echo "${METAL_SERVER}"
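To show what the two awk stages do, the sketch below applies them to a plain parameter string instead of live BSS output. The first stage splits the string at metal.server= and keeps what follows; the second keeps only the first whitespace-delimited word. The function name and the sample string are hypothetical.

```shell
# Hypothetical helper: print the value of metal.server from the params
# string passed as $1, using the same two awk stages as the command above.
metal_server_from_params() {
    printf '%s\n' "$1" \
        | awk -F 'metal.server=' '{print $2}' \
        | awk -F ' ' '{print $1}'
}

# Made-up sample parameters, for illustration only.
sample_params='ip=dhcp metal.server=http://rgw-vip.nmn/ncn-images/k8s/1.2.3 metal.no-wipe=1'
metal_server_from_params "${sample_params}"
```

If the parameter is absent, the result is empty, so an empty METAL_SERVER indicates that the lookup did not find a value.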
Update the kernel, initrd, and metal server to point to the new artifacts.
ncn-mw# S3_ARTIFACT_PATH="ncn-images/k8s/${ARTIFACT_VERSION}-1"
ncn-mw# NEW_METAL_SERVER="http://rgw-vip.nmn/${S3_ARTIFACT_PATH}"
ncn-mw# PARAMS=$(cray bss bootparameters list --hosts "${XNAME}" --format json | jq '.[] |."params"' | \
sed "/metal.server/ s|${METAL_SERVER}|${NEW_METAL_SERVER}|" | \
sed "s/metal.no-wipe=1/metal.no-wipe=0/" | \
tr -d \")
Verify that the value of $NEW_METAL_SERVER was set correctly within the boot parameters.
ncn-mw# echo "${PARAMS}"
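Rather than inspecting the output by eye, the substitution can be tried on a sample string and checked mechanically. The sketch below repeats the two sed substitutions from this step on a plain string and adds a simple test for the expected metal.server value. Both function names are hypothetical.

```shell
# Hypothetical helper: $1 = current metal.server, $2 = new metal.server,
# $3 = params string. Prints the rewritten params, with metal.no-wipe
# switched from 1 to 0 as in the step above.
rewrite_params() {
    printf '%s\n' "$3" \
        | sed "/metal.server/ s|$1|$2|" \
        | sed "s/metal.no-wipe=1/metal.no-wipe=0/"
}

# Hypothetical check: succeed only if metal.server=<$1> appears as a
# complete, space-delimited parameter in the params string $2.
params_have_server() {
    case " $2 " in
        *" metal.server=$1 "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

With the variables from this step, params_have_server "${NEW_METAL_SERVER}" "${PARAMS}" should succeed before the boot parameters are updated in BSS.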
Update the boot parameters in BSS.
ncn-mw# cray bss bootparameters update --hosts "${XNAME}" \
--kernel "s3://${S3_ARTIFACT_PATH}/kernel" \
--initrd "s3://${S3_ARTIFACT_PATH}/initrd" \
--params "${PARAMS}"
Prepare for reboot.
NOTE: If the worker node image is being customized as part of a Cray EX initial install or upgrade involving multiple products,
then refer to the HPE Cray EX System Software Getting Started Guide (S-8000)
for details on when to reboot the worker nodes to the new image.
Fail over any Postgres leader that is running on the worker node being rebooted.
ncn-mw# /usr/share/doc/csm/upgrade/1.2/scripts/k8s/failover-leader.sh <node to be rebooted>
Cordon and drain the node.
ncn-mw# kubectl drain --ignore-daemonsets=true --delete-local-data=true <node to be rebooted>
There may be pods that cannot be gracefully evicted because of Pod Disruption Budgets (PDB). This will result in messages like the following:
error when evicting pod "<pod>" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
In this case, there are two options. If the service is scalable, increase the scale so that another pod starts on a different node; the drain will then be able to delete the original pod. In most cases, however, it will be necessary to force the deletion of the pod:
ncn-mw# kubectl delete pod [-n <namespace>] --force --grace-period=0 <pod>
This will delete the offending pod, and Kubernetes should schedule a replacement on another node.
Then rerun the kubectl drain command, and it should report that the node is drained.
ncn-mw# kubectl drain --ignore-daemonsets=true --delete-local-data=true <node to be rebooted>
SSH to the node and wipe the disks.
ncn-w# vgremove -f --select 'vg_name=~ceph*'
ncn-w# vgremove -f --select 'vg_name=~metal*'
ncn-w# sgdisk --zap-all /dev/sd*
ncn-w# wipefs --all --force /dev/sd* /dev/disk/by-label/*
Reboot the NCN.
ncn-w# shutdown -r now
IMPORTANT: If the node does not shut down after 5 minutes, then proceed with the power reset below.
Power off the node.
read -s is used to prevent the password from being written to the screen or the shell history.
In the example commands below, be sure to replace <node> with the name of the node being rebooted. For example, ncn-w002.
ncn-mw# USERNAME=root
ncn-mw# read -r -s -p "NCN BMC ${USERNAME} password: " IPMI_PASSWORD
ncn-mw# export IPMI_PASSWORD
ncn-mw# ipmitool -U "${USERNAME}" -E -H <node>-mgmt -I lanplus power off
ncn-mw# ipmitool -U "${USERNAME}" -E -H <node>-mgmt -I lanplus power status
Ensure that the power is reporting as off. It may take 5-10 seconds for this to update. Wait about 30 seconds after receiving the correct power status before issuing the next command.
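Since the status can take a few seconds to update, the check can be repeated in a loop instead of being rerun by hand. The sketch below is one way to do that; wait_for_power_state is a hypothetical helper that runs whatever status command it is given and looks for the "Chassis Power is on" or "Chassis Power is off" line that ipmitool prints. The attempt count and interval are arbitrary choices.

```shell
# Hypothetical helper: run a status command repeatedly until its output
# reports the wanted chassis power state.
#   $1 = wanted state ("on" or "off")
#   $2 = maximum number of attempts
#   $3 = seconds to sleep between attempts
#   remaining arguments = the status command to run
wait_for_power_state() {
    wanted="$1"
    attempts="$2"
    interval="$3"
    shift 3
    while [ "${attempts}" -gt 0 ]; do
        # -x requires the whole line to match, -q suppresses output.
        if "$@" 2>/dev/null | grep -qx "Chassis Power is ${wanted}"; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep "${interval}"
    done
    return 1
}
```

For example: wait_for_power_state off 12 5 ipmitool -U "${USERNAME}" -E -H <node>-mgmt -I lanplus power status. The 30 second pause described above still applies after the function returns.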
Power on the node.
In the example commands below, be sure to replace <node> with the name of the node being rebooted. For example, ncn-w002.
ncn-mw# ipmitool -U "${USERNAME}" -E -H <node>-mgmt -I lanplus power on
ncn-mw# ipmitool -U "${USERNAME}" -E -H <node>-mgmt -I lanplus power status
Ensure that the power is reporting as on. It may take 5-10 seconds for this to update.
Set metal.no-wipe=1.
ncn-mw# PARAMS=$(cray bss bootparameters list --hosts "${XNAME}" --format json | jq '.[] |."params"' | \
sed "s/metal.no-wipe=0/metal.no-wipe=1/" | \
tr -d \")
ncn-mw# cray bss bootparameters update --hosts "${XNAME}" \
--params "${PARAMS}"
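The value in PARAMS can be confirmed before or after the update is sent. The following is a minimal sketch of such a check on a plain parameter string; nowipe_value is a hypothetical helper name.

```shell
# Hypothetical helper: print the value of metal.no-wipe from the params
# string passed as $1. Prints nothing if the parameter is absent.
nowipe_value() {
    printf '%s\n' "$1" \
        | tr ' ' '\n' \
        | sed -n 's/^metal\.no-wipe=//p' \
        | head -n 1
}
```

For example, [ "$(nowipe_value "${PARAMS}")" = "1" ] succeeds only when the disks are protected from being wiped on the next boot.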