NCN Worker Image Customization

NOTE: Some of the documentation linked from this page mentions use of the Boot Orchestration Service (BOS). The use of BOS is only relevant for booting compute nodes and can be ignored when working with NCN images.

This document describes the customization of a Kubernetes NCN image. The same steps can be used to modify a Ceph image.

  1. Identify the NCN image to be modified.

    This example assumes that the administrator wants to modify the Kubernetes image that is currently in use by Kubernetes NCNs. However, the steps are the same for any Management NCN SquashFS image.

    If the image to be modified is the image currently booted on an NCN, the value for ARTIFACT_VERSION can be found by looking at the boot parameters for the NCNs, or from /proc/cmdline on a booted NCN. The version has the form X.Y.Z.
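
    For example, on a booted worker node the version can be extracted from the metal.server boot parameter (a sketch, assuming the parameter points at the standard ncn-images S3 path used later in this procedure):

    ncn-w# grep -o 'ncn-images/k8s/[0-9.]*' /proc/cmdline   # prints ncn-images/k8s/X.Y.Z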

  2. Obtain the NCN image’s associated artifacts (SquashFS, kernel, and initrd).

    The following example commands show how to download these artifacts from S3, where the NCN image artifacts are stored.

    ncn-mw# ARTIFACT_VERSION=<artifact-version>
    
    ncn-mw# cray artifacts get ncn-images k8s/"${ARTIFACT_VERSION}"/filesystem.squashfs ./"${ARTIFACT_VERSION}"-filesystem.squashfs
    
    ncn-mw# cray artifacts get ncn-images k8s/"${ARTIFACT_VERSION}"/kernel ./"${ARTIFACT_VERSION}"-kernel
    
    ncn-mw# cray artifacts get ncn-images k8s/"${ARTIFACT_VERSION}"/initrd ./"${ARTIFACT_VERSION}"-initrd
    
    ncn-mw# export IMS_ROOTFS_FILENAME="${ARTIFACT_VERSION}"-filesystem.squashfs
    
    ncn-mw# export IMS_KERNEL_FILENAME="${ARTIFACT_VERSION}"-kernel
    
    ncn-mw# export IMS_INITRD_FILENAME="${ARTIFACT_VERSION}"-initrd
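
    Optionally, confirm that all three artifacts downloaded successfully before proceeding:

    ncn-mw# ls -lh "${ARTIFACT_VERSION}"-filesystem.squashfs "${ARTIFACT_VERSION}"-kernel "${ARTIFACT_VERSION}"-initrd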
    
  3. Import the NCN image into IMS.

    Perform the Import External Image to IMS procedure; the IMS_ROOTFS_FILENAME, IMS_KERNEL_FILENAME, and IMS_INITRD_FILENAME variables exported in the previous step supply the file names that procedure expects. Skip any sections of that procedure that do not apply here.

  4. Clone the csm-config-management repository.

    ncn-mw# VCS_USER=$(kubectl get secret -n services vcs-user-credentials --template={{.data.vcs_username}} | base64 --decode)
    ncn-mw# VCS_PASS=$(kubectl get secret -n services vcs-user-credentials --template={{.data.vcs_password}} | base64 --decode)
    ncn-mw# git clone "https://${VCS_USER}:${VCS_PASS}@api-gw-service-nmn.local/vcs/cray/csm-config-management.git"
    

    A Git commit hash from this repository is needed in the following step.
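
    For example, to record the hash of the current HEAD commit (a quick sketch; if the playbook added in the next step is committed on top of it, use that newer commit's hash instead):

    ncn-mw# git -C csm-config-management rev-parse HEAD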

  5. Create a CFS Configuration.

    NOTE: A new initrd must be generated at the end of the CFS session. Create a new Ansible playbook named ncn-initrd.yml at the root of the csm-config-management repository. The new playbook should have the following contents:

    - hosts: Management_Worker
      any_errors_fatal: true
      remote_user: root
    
      tasks:
      - name: Create NCN initrd
        command: /srv/cray/scripts/common/create-ims-initrd.sh
        when: cray_cfs_image|default(false)|bool
    

    This playbook should be the final layer in the CFS configuration created in the upcoming steps.
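
    Because CFS pulls its content from VCS, the new playbook must be committed and pushed before a configuration can reference it. A minimal sketch (the commit message is arbitrary):

    ncn-mw# cd csm-config-management
    ncn-mw# git add ncn-initrd.yml
    ncn-mw# git commit -m "Add NCN initrd generation playbook"
    ncn-mw# git push

    The hash of this new commit is the one to reference in the CFS configuration layer.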

    NOTE: If desired, the default SSH host keys can be removed with a CFS Ansible task:

    - name: Remove SSH host keys
      command: rm -f /etc/ssh/*key*
      warn: false
    

    New host keys will be generated by sshd when the NCN boots for the first time.
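
    A minimal sketch of such a CFS configuration, with ncn-initrd.yml as the final (here, only) layer. The configuration name ncn-image-customization is a hypothetical example, and <git-commit-hash> must be replaced with the commit hash recorded earlier; in practice, product configuration layers would normally precede the ncn-initrd layer:

    ncn-mw# cat ncn-image-customization.json
    {
      "layers": [
        {
          "cloneUrl": "https://api-gw-service-nmn.local/vcs/cray/csm-config-management.git",
          "commit": "<git-commit-hash>",
          "name": "ncn-initrd",
          "playbook": "ncn-initrd.yml"
        }
      ]
    }
    ncn-mw# cray cfs configurations update ncn-image-customization --file ncn-image-customization.json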

  6. Create an Image Customization CFS Session.

    In this section, use the following values for the target definition and target group in the cray cfs sessions create command invocation (a sketch of the full command follows):

    --target-definition image
    --target-group Management_Worker
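
    A sketch of the full invocation, assuming the hypothetical configuration name from the previous step and the session name example (to match the describe command in the next step); replace <IMS image ID> with the image ID recorded when the image was imported into IMS:

    ncn-mw# cray cfs sessions create --name example \
             --configuration-name ncn-image-customization \
             --target-definition image \
             --target-group Management_Worker <IMS image ID> \
             --format json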
    
  7. Download the resultant NCN artifacts.

    NOTE: $IMS_RESULTANT_IMAGE_ID in the following commands is the result_id returned in the output of the last command in the previous step, which is repeated here for reference:

    ncn-mw# cray cfs sessions describe example --format json | jq .status.artifacts
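
    One way to capture the resultant image ID directly into a variable (a sketch, assuming the session is named example and has a single artifact entry):

    ncn-mw# IMS_RESULTANT_IMAGE_ID=$(cray cfs sessions describe example --format json | jq -r '.status.artifacts[0].result_id')
    ncn-mw# echo "${IMS_RESULTANT_IMAGE_ID}"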
    

    Download the resultant NCN artifacts:

    ncn-mw# cray artifacts get boot-images "${IMS_RESULTANT_IMAGE_ID}/rootfs" "kubernetes-${ARTIFACT_VERSION}-1.squashfs"
    
    ncn-mw# cray artifacts get boot-images "${IMS_RESULTANT_IMAGE_ID}/initrd" "initrd.img-${ARTIFACT_VERSION}-1.xz"
    
    ncn-mw# cray artifacts get boot-images "${IMS_RESULTANT_IMAGE_ID}/kernel" "${ARTIFACT_VERSION}-1.kernel"
    
  8. Upload NCN boot artifacts into S3.

    This step assumes that the docs-csm RPM is installed. See Check for latest documentation.

    ncn-mw# /usr/share/doc/csm/scripts/ceph-upload-file-public-read.py --bucket-name ncn-images --key-name "k8s/${ARTIFACT_VERSION}-1/filesystem.squashfs" --file-name "kubernetes-${ARTIFACT_VERSION}-1.squashfs"
    
    ncn-mw# /usr/share/doc/csm/scripts/ceph-upload-file-public-read.py --bucket-name ncn-images --key-name "k8s/${ARTIFACT_VERSION}-1/initrd" --file-name "initrd.img-${ARTIFACT_VERSION}-1.xz"
    
    ncn-mw# /usr/share/doc/csm/scripts/ceph-upload-file-public-read.py --bucket-name ncn-images --key-name "k8s/${ARTIFACT_VERSION}-1/kernel" --file-name "${ARTIFACT_VERSION}-1.kernel"
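
    Optionally, confirm that all three artifacts are present in the bucket (a sketch; the JSON layout of the list output may vary by release):

    ncn-mw# cray artifacts list ncn-images --format json | jq -r '.artifacts[].Key' | grep "${ARTIFACT_VERSION}-1"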
    
  9. Update NCN boot parameters.

    1. Get the existing metal.server setting for the component name (xname) of the node of interest.

      ncn-mw# XNAME=<node-xname>
      ncn-mw# METAL_SERVER=$(cray bss bootparameters list --hosts "${XNAME}" --format json | jq '.[] |."params"' \
               | awk -F 'metal.server=' '{print $2}' \
               | awk -F ' ' '{print $1}')
      
    2. Verify that the variable was set correctly.

      ncn-mw# echo "${METAL_SERVER}"
      
    3. Update the kernel, initrd, and metal server to point to the new artifacts.

      ncn-mw# S3_ARTIFACT_PATH="ncn-images/k8s/${ARTIFACT_VERSION}-1"
      ncn-mw# NEW_METAL_SERVER="http://rgw-vip.nmn/${S3_ARTIFACT_PATH}"
      
      ncn-mw# PARAMS=$(cray bss bootparameters list --hosts "${XNAME}" --format json | jq '.[] |."params"' | \
               sed "/metal.server/ s|${METAL_SERVER}|${NEW_METAL_SERVER}|" | \
               sed "s/metal.no-wipe=1/metal.no-wipe=0/" | \
               tr -d \")
      
    4. Verify that the value of $NEW_METAL_SERVER was set correctly within the boot parameters.

      ncn-mw# echo "${PARAMS}"
      
    5. Update the boot parameters in BSS.

      ncn-mw# cray bss bootparameters update --hosts "${XNAME}"   \
               --kernel "s3://${S3_ARTIFACT_PATH}/kernel" \
               --initrd "s3://${S3_ARTIFACT_PATH}/initrd" \
               --params "${PARAMS}"
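
      To confirm that BSS stored the new values, list them again:

      ncn-mw# cray bss bootparameters list --hosts "${XNAME}" --format json | jq '.[] | .kernel, .initrd'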
      
  10. Prepare for reboot.

    NOTE: If the worker node image is being customized as part of a Cray EX initial install or upgrade involving multiple products, then refer to the HPE Cray EX System Software Getting Started Guide (S-8000) for details on when to reboot the worker nodes to the new image.

    1. Fail over any Postgres leader that is running on the worker node being rebooted.

      ncn-mw# /usr/share/doc/csm/upgrade/1.2/scripts/k8s/failover-leader.sh <node to be rebooted>
      
    2. Cordon and drain the node.

      ncn-mw# kubectl drain --ignore-daemonsets=true --delete-local-data=true <node to be rebooted>
      

      There may be pods that cannot be evicted gracefully because of Pod Disruption Budgets (PDBs). This results in messages like the following:

      error when evicting pod "<pod>" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
      

      In this case, there are two options. If the service is scalable, increase its scale so that a replacement pod starts on another node; the drain can then evict the original. In practice, however, it will probably be necessary to force the deletion of the pod:

      ncn-mw# kubectl delete pod [-n <namespace>] --force --grace-period=0 <pod>
      

      This will delete the offending pod, and Kubernetes should schedule a replacement on another node. Then rerun the kubectl drain command, and it should report that the node is drained.

      ncn-mw# kubectl drain --ignore-daemonsets=true --delete-local-data=true <node to be rebooted>
      
    3. SSH to the node and wipe the disks.

      ncn-w# vgremove -f --select 'vg_name=~ceph*'
      ncn-w# vgremove -f --select 'vg_name=~metal*'
      ncn-w# sgdisk --zap-all /dev/sd*
      ncn-w# wipefs --all --force /dev/sd* /dev/disk/by-label/*
      
  11. Reboot the NCN.

    ncn-w# shutdown -r now
    

    IMPORTANT: If the node does not shut down after 5 minutes, then proceed with the power reset below.

    1. Power off the node.

      read -s is used to prevent the password from being written to the screen or the shell history.

      In the example commands below, be sure to replace <node> with the name of the node being rebooted. For example, ncn-w002.

      ncn-mw# USERNAME=root
      ncn-mw# read -r -s -p "NCN BMC ${USERNAME} password: " IPMI_PASSWORD
      ncn-mw# export IPMI_PASSWORD
      ncn-mw# ipmitool -U "${USERNAME}" -E -H <node>-mgmt -I lanplus power off
      ncn-mw# ipmitool -U "${USERNAME}" -E -H <node>-mgmt -I lanplus power status
      

      Ensure that the power is reporting as off. It may take 5-10 seconds for this to update. Wait about 30 seconds after receiving the correct power status before issuing the next command.

    2. Power on the node.

      In the example commands below, be sure to replace <node> with the name of the node being rebooted. For example, ncn-w002.

      ncn-mw# ipmitool -U "${USERNAME}" -E -H <node>-mgmt -I lanplus power on
      ncn-mw# ipmitool -U "${USERNAME}" -E -H <node>-mgmt -I lanplus power status
      

      Ensure that the power is reporting as on. It may take 5-10 seconds for this to update.

  12. Set metal.no-wipe=1.

    ncn-mw# PARAMS=$(cray bss bootparameters list --hosts "${XNAME}" --format json | jq '.[] |."params"' | \
             sed "s/metal.no-wipe=0/metal.no-wipe=1/" | \
             tr -d \")
    
    ncn-mw# cray bss bootparameters update --hosts "${XNAME}" \
             --params "${PARAMS}"
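
    To confirm the change took effect (a quick check):

    ncn-mw# cray bss bootparameters list --hosts "${XNAME}" --format json | jq -r '.[] | .params' | grep -o 'metal.no-wipe=[01]'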