This page describes rebooting and deploying the non-compute node that is currently hosting the LiveCD.
These services must be healthy in Kubernetes before the reboot of the LiveCD can take place.
Required Platform Services:
cray-dhcp-kea
cray-dns-unbound
cray-bss
cray-sls
cray-s3
cray-ipxe
cray-tftp
While the node is rebooting, it will be available only through Serial-over-LAN and local terminals.
This procedure entails deactivating the LiveCD, meaning the LiveCD and all of its resources will be unavailable.
Check for workarounds in the /opt/cray/csm/workarounds/livecd-pre-reboot directory. If there are any workarounds in that directory, run them when their instructions indicate. Timing is critical to ensure data is loaded properly, so run them only when directed. Instructions are in the README files.
# Example
pit# ls /opt/cray/csm/workarounds/livecd-pre-reboot
If there is a workaround here, the output looks similar to the following:
CASMINST-435
The steps in this guide walk an administrator through loading hand-off data, setting up a remote console for observing the reboot, and rebooting the node.
At the end of these steps, the LiveCD will no longer be active. The node it was using will join the Kubernetes cluster as the third and final master, forming a quorum.
It is very important that all livecd-pre-reboot workarounds have been run by the administrator before starting this stage.
Upload the SLS file.
Note: the SYSTEM_NAME environment variable must be set.
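For example, if the system name is eniac (the name used elsewhere in this guide), substitute the actual system name as appropriate:
pit# export SYSTEM_NAME=eniac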
pit# csi upload-sls-file --sls-file /var/www/ephemeral/prep/${SYSTEM_NAME}/sls_input_file.json
Expected output looks similar to the following:
2021/02/02 14:05:15 Retrieving S3 credentails ( sls-s3-credentials ) for SLS
2021/02/02 14:05:15 Uploading SLS file: /var/www/ephemeral/prep/eniac/sls_input_file.json
2021/02/02 14:05:15 Successfully uploaded SLS Input File.
Get a token to use for authenticated communication with the gateway.
pit# export TOKEN=$(curl -k -s -S -d grant_type=client_credentials \
-d client_id=admin-client \
-d client_secret=`kubectl get secrets admin-client-auth -o jsonpath='{.data.client-secret}' | base64 -d` \
https://api-gw-service-nmn.local/keycloak/realms/shasta/protocol/openid-connect/token | jq -r '.access_token')
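A quick sanity check that a token was actually retrieved (an empty value indicates the kubectl or curl call failed):
pit# [[ -n "$TOKEN" ]] && echo "Token retrieved" || echo "Token is empty"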
Upload the same data.json file used earlier to BSS, the Kubernetes cloud-init DataSource. If this file was changed as a result of any customizations or workarounds, use the path to that file instead. This step will prompt for the root password of the NCNs.
pit# csi handoff bss-metadata --data-file /var/www/ephemeral/configs/data.json
Patch the metadata for the CEPH nodes to have the correct run commands:
pit# python3 /usr/share/doc/metal/scripts/patch-ceph-runcmd.py
Perform the csi handoff, uploading NCN boot artifacts into S3.
Set the variables.
IMPORTANT: The variables you set depend on whether or not you followed the steps in 108-NCN-NTP.md#setting-a-local-timezone. The two paths forward are listed below:
If you customized the timezones, set the following variables:
pit# export artdir=/var/www/ephemeral/data
pit# export k8sdir=$artdir/k8s
pit# export cephdir=$artdir/ceph
If you did not customize the timezones, set the following variables (this is the default path):
pit# export CSM_RELEASE=csm-x.y.z
pit# export artdir=/var/www/ephemeral/${CSM_RELEASE}/images
pit# export k8sdir=$artdir/kubernetes
pit# export cephdir=$artdir/storage-ceph
After setting the variables above per your situation, run:
pit# csi handoff ncn-images \
--k8s-kernel-path $k8sdir/*.kernel \
--k8s-initrd-path $k8sdir/initrd.img*.xz \
--k8s-squashfs-path $k8sdir/*.squashfs \
--ceph-kernel-path $cephdir/*.kernel \
--ceph-initrd-path $cephdir/initrd.img*.xz \
--ceph-squashfs-path $cephdir/*.squashfs
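If csi reports that it cannot find an artifact, confirm that the variables resolve to existing kernel, initrd, and squashfs files before retrying (a quick check):
pit# ls -1 $k8sdir $cephdir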
List IPv4 boot options using efibootmgr:
pit# efibootmgr | grep -Ei "ip(v4|4)"
Configure and trim UEFI entries on the PIT node.
On the PIT node, do the following two steps outlined in Fixing Boot-Order. For options that depend on the node type, treat it as a master node.
1. Setting Order
2. Trimming
Identify Port-1 of Riser-1 in the efibootmgr output and set the PXEPORT variable.
WARNING
Pay attention to the boot order. If it does not show any of these examples, double-check that PCIe booting was enabled. Run through the "Print Current UEFI and SR-IOV State" section, then fix up as needed with the "Setting Expected Values" section before returning here.
Example 1:
Possible output of the efibootmgr command in the previous step:
Boot0005* UEFI IPv4: Network 00 at Riser 02 Slot 01
Boot0007* UEFI IPv4: Network 01 at Riser 02 Slot 01
Boot000A* UEFI IPv4: Intel Network 00 at Baseboard
Boot000C* UEFI IPv4: Intel Network 01 at Baseboard
Looking at the above output, Network 00 at Riser 02 Slot 01 is reasonably our Port-1 of Riser-1, which would be Boot0005 in this case. Set PXEPORT to the numeric suffix of that name.
pit# export PXEPORT=0005
Example 2:
Possible output of the efibootmgr command in the previous step:
Boot0013* OCP Slot 10 Port 1 : Marvell FastLinQ 41000 Series - 2P 25GbE SFP28 QL41232HQCU-HC OCP3 Adapter - NIC - Marvell FastLinQ 41000 Series - 2P 25GbE SFP28 QL41232HQCU-HC OCP3 Adapter - PXE (HTTP(S) IPv4)
Boot0014* OCP Slot 10 Port 1 : Marvell FastLinQ 41000 Series - 2P 25GbE SFP28 QL41232HQCU-HC OCP3 Adapter - NIC - Marvell FastLinQ 41000 Series - 2P 25GbE SFP28 QL41232HQCU-HC OCP3 Adapter - PXE (PXE IPv4)
Boot0017* Slot 1 Port 1 : Marvell FastLinQ 41000 Series - 2P 25GbE SFP28 QL41232HLCU-HC MD2 Adapter - NIC - Marvell FastLinQ 41000 Series - 2P 25GbE SFP28 QL41232HLCU-HC MD2 Adapter - PXE (HTTP(S) IPv4)
Boot0018* Slot 1 Port 1 : Marvell FastLinQ 41000 Series - 2P 25GbE SFP28 QL41232HLCU-HC MD2 Adapter - NIC - Marvell FastLinQ 41000 Series - 2P 25GbE SFP28 QL41232HLCU-HC MD2 Adapter - PXE (PXE IPv4)
In the above output, look for the non-OCP device that has "PXE IPv4" rather than "HTTP(S) IPv4", which would be Boot0018 in this case. Set PXEPORT to the numeric suffix of that name.
pit# export PXEPORT=0018
Example 3:
The output below is what a Gigabyte machine may look like.
Boot000B* UEFI: PXE IP4 Mellanox Network Adapter - B8:59:9F:C7:12:F2
Boot000C* UEFI: PXE IP4 Mellanox Network Adapter - B8:59:9F:C7:12:F3
Boot000D* UEFI: PXE IP6 Mellanox Network Adapter - B8:59:9F:C7:12:F2
Boot000E* UEFI: PXE IP6 Mellanox Network Adapter - B8:59:9F:C7:12:F3
Use efibootmgr to set the next boot device to the port selected in the previous step.
pit# efibootmgr -n $PXEPORT 2>&1 | grep -i BootNext
If PXEPORT was set to 0005, the expected output looks similar to the following:
BootNext: 0005
SKIP THIS STEP IF USING USB LIVECD
The remote LiveCD will lose all changes and local data once it is rebooted.
It is advised to back up the prep directory for the LiveCD to a location off the system before rebooting. This will facilitate setting the LiveCD up again in the event of a bad reboot. Follow the procedure in Virtual ISO Boot - Backing up the OverlayFS. After completing that, return here and proceed to the next step.
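In addition to that procedure, an ad-hoc copy of the prep directory can be taken; the sketch below assumes an external host reachable from the LiveCD (the destination host and path are placeholders):
pit# rsync -av /var/www/ephemeral/prep/ user@backup-host:/backups/livecd-prep/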
Optionally, set up conman or a serial console from any laptop if not already on one.
external# script -a boot.livecd.$(date +%Y-%m-%d).txt
external# export PS1='\u@\H \D{%Y-%m-%d} \t \w # '
external# SYSTEM_NAME=eniac
external# username=root
external# IPMI_PASSWORD=changeme
external# ipmitool -I lanplus -U $username -E -H ${SYSTEM_NAME}-ncn-m001-mgmt chassis power status
external# ipmitool -I lanplus -U $username -E -H ${SYSTEM_NAME}-ncn-m001-mgmt sol activate
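If activation fails because a stale SOL session is still open, it can be cleared first and then re-activated (for example):
external# ipmitool -I lanplus -U $username -E -H ${SYSTEM_NAME}-ncn-m001-mgmt sol deactivate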
Collect the CAN IPs for logging into the other NCNs while this happens. This is useful for interacting with and debugging the Kubernetes cluster while the LiveCD is offline.
pit# ssh ncn-m002
ncn-m002# ip a show vlan007 | grep inet
Expected output looks similar to the following:
inet 10.102.11.13/24 brd 10.102.11.255 scope global vlan007
inet6 fe80::1602:ecff:fed9:7820/64 scope link
Now log in from another machine to verify that the IP address is usable:
external# ssh root@10.102.11.13
ncn-m002#
Keep this terminal active, as it will enable kubectl commands during the bring-up of the new NCN.
If the reboot successfully deploys the LiveCD node, this terminal can be exited.
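For example, the following can be left running on ncn-m002 to watch for ncn-m001 rejoining the cluster after the reboot:
ncn-m002# kubectl get nodes -w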
Wipe the node beneath the LiveCD. Erasing the RAID labels will trigger a fresh partition table to be deployed.
WARNING
Do not assume that the first three disks (e.g. sda, sdb, and sdc) are the ones to wipe; they could be any letters. Choosing the wrong ones may result in wiping the USB stick.
Select the disks to wipe (SATA/NVMe/SAS):
pit# md_disks="$(lsblk -l -o SIZE,NAME,TYPE,TRAN | grep -E '(sata|nvme|sas)' | sort -h | awk '{print "/dev/" $2}')"
Sanity check; print disks into typescript or console
pit# echo $md_disks
Expected output looks similar to the following:
/dev/sda /dev/sdb /dev/sdc
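If there is any doubt about which devices will be wiped, in particular to confirm the USB stick is not in the list, the labels and transports can be printed first (a quick check):
pit# lsblk -o NAME,SIZE,TRAN,TYPE,LABEL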
Wipe. This is irreversible.
pit# wipefs --all --force $md_disks
If any disks had labels present, output looks similar to the following:
/dev/sda: 8 bytes were erased at offset 0x00000200 (gpt): 45 46 49 20 50 41 52 54
/dev/sda: 8 bytes were erased at offset 0x6fc86d5e00 (gpt): 45 46 49 20 50 41 52 54
/dev/sda: 2 bytes were erased at offset 0x000001fe (PMBR): 55 aa
/dev/sdb: 6 bytes were erased at offset 0x00000000 (crypto_LUKS): 4c 55 4b 53 ba be
/dev/sdb: 6 bytes were erased at offset 0x00004000 (crypto_LUKS): 53 4b 55 4c ba be
/dev/sdc: 8 bytes were erased at offset 0x00000200 (gpt): 45 46 49 20 50 41 52 54
/dev/sdc: 8 bytes were erased at offset 0x6fc86d5e00 (gpt): 45 46 49 20 50 41 52 54
/dev/sdc: 2 bytes were erased at offset 0x000001fe (PMBR): 55 aa
Verify that there are no error messages in the output.
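Optionally, confirm that no filesystem signatures remain on the wiped devices (a minimal check):
pit# lsblk -o NAME,FSTYPE,LABEL $md_disks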
Quit the typescript session with the exit command and copy the file (booted-csm-livecd.<date>.txt) to a location on another server for reference later.
pit# exit
Reboot the LiveCD.
pit# reboot
The node should boot, acquire its hostname (i.e. ncn-m001), and run cloud-init.
NOTE: If the node has PXE boot issues, such as getting PXE errors or not pulling the ipxe.efi binary, see PXE boot troubleshooting.
NOTE: If ncn-m001 booted without a hostname (as ncn) or it did not run all the cloud-init scripts, the following commands need to be run (but only in that circumstance).
ncn-m001# mkdir -pv /mnt/cow
ncn-m001# mount -vL cow /mnt/cow
ncn-m001# cp -pv /mnt/cow/rw/etc/sysconfig/network/ifroute-vlan* /etc/sysconfig/network/
ncn-m001# cp -pv /mnt/cow/rw/etc/sysconfig/network/ifcfg-lan0 /etc/sysconfig/network/
Run the set-dhcp-to-static.sh script:
ncn-m001# /srv/cray/scripts/metal/set-dhcp-to-static.sh
ncn-m001# cloud-init clean
ncn-m001# cloud-init init
ncn-m001# cloud-init modules -m init
ncn-m001# cloud-init modules -m config
ncn-m001# cloud-init modules -m final
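Whether or not the recovery steps above were needed, cloud-init can be asked to report whether it finished (a minimal check, assuming the installed cloud-init version provides the status subcommand):
ncn-m001# cloud-init status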
Once cloud-init has completed successfully, log in and start a typescript (the IP address used here is the one noted for ncn-m002 in an earlier step).
external# ssh root@10.102.11.13
ncn-m002# ssh ncn-m001
ncn-m001# script -a verify.csm.$(date +%Y-%m-%d).txt
ncn-m001# export PS1='\u@\H \D{%Y-%m-%d} \t \w # '
Optionally, change the root password on ncn-m001 to match the other management NCNs.
This step is only needed when the other management NCNs' passwords were customized during the CSM Metal Install procedure. If the management NCNs still have the default password, this step can be skipped.
ncn-m001# passwd
Run kubectl get nodes to see the full Kubernetes cluster.
NOTE: If the new node fails to join the cluster after running other cloud-init items, please refer to the handoff
ncn-m001# kubectl get nodes
Expected output looks similar to the following:
NAME STATUS ROLES AGE VERSION
ncn-m001 Ready master 7s v1.18.6
ncn-m002 Ready master 4h40m v1.18.6
ncn-m003 Ready master 4h38m v1.18.6
ncn-w001 Ready <none> 4h39m v1.18.6
ncn-w002 Ready <none> 4h39m v1.18.6
ncn-w003 Ready <none> 4h39m v1.18.6
Follow the procedure defined in Accessing CSI from a USB or RemoteISO.
Restore and verify the site link. It will be necessary to restore the ifcfg-lan0 and ifroute-vlan002 files either from the manual backup taken in step 6, or by remounting the USB stick and copying them from the prep directory to /etc/sysconfig/network/.
ncn-m001# export SYSTEM_NAME=eniac
ncn-m001# cp -v /mnt/pitdata/prep/${SYSTEM_NAME}/pit-files/ifcfg-lan0 /etc/sysconfig/network/
ncn-m001# cp -v /mnt/pitdata/prep/${SYSTEM_NAME}/pit-files/ifroute-vlan002 /etc/sysconfig/network/
ncn-m001# wicked ifreload lan0
Run ip a to show the IPs and verify the site link.
ncn-m001# ip a show lan0
Run ip a to show the VLANs and verify they all have IPs.
ncn-m001# ip a show vlan002
ncn-m001# ip a show vlan004
ncn-m001# ip a show vlan007
Verify we do not have a metal bootstrap IP.
ncn-m001# ip a show bond0
Enable the NCN Disk Wiping Safeguard to prevent destructive behavior from occurring during reboot.
NOTE: This safeguard needs to be removed to facilitate bare-metal deployments of new nodes. The linked Enable NCN Disk Wiping Safeguard procedure can be used to disable the safeguard by setting the value back to 0.
Download and install/upgrade the workaround and documentation RPMs. If this machine does not have direct internet access, these RPMs will need to be downloaded externally and then copied to the system to be installed.
ncn-m001# rpm -Uvh --force https://storage.googleapis.com/csm-release-public/shasta-1.4/docs-csm/docs-csm-latest.noarch.rpm
ncn-m001# rpm -Uvh --force https://storage.googleapis.com/csm-release-public/shasta-1.4/csm-install-workarounds/csm-install-workarounds-latest.noarch.rpm
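To confirm the packages installed, query the RPM database (the package names below are taken from the RPM file names above):
ncn-m001# rpm -q docs-csm csm-install-workarounds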
Now check for workarounds in the /opt/cray/csm/workarounds/livecd-post-reboot directory. If there are any workarounds in that directory, run them now. Each has its own instructions in its respective README.md file.
Note: The following command assumes that the data partition of the USB stick has been remounted at /mnt/pitdata.
# Example
ncn-m001# ls /opt/cray/csm/workarounds/livecd-post-reboot
If there are workarounds here, the output looks similar to the following:
CASMINST-1309 CASMINST-1570
At this time, the NCN cluster is fully established. The administrator may now eject any mounted USB stick:
ncn-m001# umount -v /mnt/rootfs /mnt/sqfs /mnt/livecd /mnt/pitdata
The administrator can continue on to CSM Validation to conclude the CSM product deployment.
There are some operational steps to be taken in NCN/Management Node Locking and then Firmware updates with FAS.
Then the administrator should install additional products following the procedures in the HPE Cray EX System Installation and Configuration Guide S-8000.
After deploying the LiveCD’s NCN, the LiveCD USB itself is unharmed and available to an administrator.
Mount and view the USB stick:
ncn-m001# mkdir -pv /mnt/{cow,pitdata}
ncn-m001# mount -vL cow /mnt/cow
ncn-m001# mount -vL PITDATA /mnt/pitdata
ncn-m001# ls -ld /mnt/cow/rw/*
Example output:
drwxr-xr-x 2 root root 4096 Jan 28 15:47 /mnt/cow/rw/boot
drwxr-xr-x 8 root root 4096 Jan 29 07:25 /mnt/cow/rw/etc
drwxr-xr-x 3 root root 4096 Feb 5 04:02 /mnt/cow/rw/mnt
drwxr-xr-x 3 root root 4096 Jan 28 15:49 /mnt/cow/rw/opt
drwx------ 10 root root 4096 Feb 5 03:59 /mnt/cow/rw/root
drwxrwxrwt 13 root root 4096 Feb 5 04:03 /mnt/cow/rw/tmp
drwxr-xr-x 7 root root 4096 Jan 28 15:40 /mnt/cow/rw/usr
drwxr-xr-x 7 root root 4096 Jan 28 15:47 /mnt/cow/rw/var
Look at the contents of /mnt/pitdata:
ncn-m001# ls -ld /mnt/pitdata/*
Example output:
drwxr-xr-x 2 root root 4096 Feb 3 04:32 /mnt/pitdata/configs
drwxr-xr-x 14 root root 4096 Feb 3 07:26 /mnt/pitdata/csm-0.7.29
-rw-r--r-- 1 root root 22159328586 Feb 2 22:18 /mnt/pitdata/csm-0.7.29.tar.gz
drwxr-xr-x 4 root root 4096 Feb 3 04:25 /mnt/pitdata/data
drwx------ 2 root root 16384 Jan 28 15:41 /mnt/pitdata/lost+found
drwxr-xr-x 5 root root 4096 Feb 3 04:20 /mnt/pitdata/prep
drwxr-xr-x 2 root root 4096 Jan 28 16:07 /mnt/pitdata/static
Be kind, unmount the USB before ejecting it:
ncn-m001# umount -v /mnt/cow /mnt/pitdata
CSI is not installed on the NCNs; however, it is compiled against the same base architecture and OS. Therefore, it can be accessed from any LiveCD ISO file, even if it is not the one used for the original installation.
Set the CSM Release
ncn# export CSM_RELEASE=csm-x.y.z
Make directories.
ncn# mkdir -pv /mnt/livecd /mnt/rootfs /mnt/sqfs /mnt/pitdata
Mount the rootfs.
ncn# mount -vL PITDATA /mnt/pitdata
ncn# mount -v /mnt/pitdata/${CSM_RELEASE}/cray-pre-install-toolkit-*.iso /mnt/livecd/
ncn# mount -v /mnt/livecd/LiveOS/squashfs.img /mnt/sqfs/
ncn# mount -v /mnt/sqfs/LiveOS/rootfs.img /mnt/rootfs/
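A quick check that the nested mounts succeeded:
ncn# mount | grep -E '/mnt/(pitdata|livecd|sqfs|rootfs)'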
Invoke CSI usage to validate it runs and is ready for use:
ncn# /mnt/rootfs/usr/bin/csi --help
Copy the CSI binary and CSM workaround documentation off to /tmp/:
ncn# cp -pv /mnt/rootfs/usr/bin/csi /tmp/csi
For more information about the safeguard, see /usr/share/doc/metal-dracut/mdsquash on any NCN (view $(rpm -qi --fileprovide dracut-metal-mdsquash | grep -i readme)).
After all the NCNs have been installed, it is imperative to disable the automated wiping of disks so subsequent boots do not destroy any data unintentionally. First follow the procedure above to remount the assets and then get a new token:
ncn-m001# export TOKEN=$(curl -k -s -S -d grant_type=client_credentials \
-d client_id=admin-client \
-d client_secret=`kubectl get secrets admin-client-auth -o jsonpath='{.data.client-secret}' | base64 -d` \
https://api-gw-service-nmn.local/keycloak/realms/shasta/protocol/openid-connect/token | jq -r '.access_token')
Then make a call to CSI to update BSS:
ncn-m001# /tmp/csi handoff bss-update-param --set metal.no-wipe=1
NOTE: /tmp/csi will be deleted on the next reboot, since /tmp/ is mounted as tmpfs and does not persist.
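On a later boot of any NCN, the new setting should show up on the kernel command line; assuming BSS passes the parameter through to the boot command line, it can be spot-checked with:
ncn# grep -o 'metal.no-wipe=[01]' /proc/cmdline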