This page will prepare you for a CSM install using the LiveCD in different scenarios.
The prerequisites for each install scenario are defined here. All prerequisites must be met before commencing an installation.
After finishing any of these prerequisite guides, an administrator may move to Starting an Installation.
Each item below defines a prerequisite that must be completed on systems with existing Shasta v1.3 installations. Optional steps are noted as such.
Although some configuration data can be saved from a Shasta v1.3 system, there are new configuration files needed for Shasta v1.4. Some of this data is easier to collect from a running Shasta v1.3 system.
There may be some operational data to be saved such as any nodes which are disabled or marked down in a workload manager. These nodes might need hardware or firmware actions to repair them. If not addressed, and the newer firmware in v1.4 does not improve their performance or operation, then these may need to be disabled with v1.4 as well.
There may be site modifications to the system from v1.3 which are desired in v1.4. They cannot be directly copied to v1.4, however, recommendation will be made about what to save. Some saved information from v1.3 may be referenced when making a similar site modification to v1.3.
See the Harvest Shasta v1.3 Information page for the data harvesting procedure.
See the service guides for information regarding the v1.4 configuration files.
Follow site processes to quiesce the system, such as draining workload manager queues, saving user data somewhere off the system, and limiting new logins to application nodes.
Check for running slurm jobs
ncn-w001# ssh nid001000 squeue -l
Check for running PBS jobs
ncn-w001# ssh nid001000 qstat -Q
ncn-w001# ssh nid001000 qstat -a
Obtain the authorization key for SAT.
See System Security and Authentication, Authenticate an Account with the Command Line, SAT Authentication in the Cray Shasta Administration Guide 1.3 S-8001 for more information.
v1.3.0: Use Rev C of the guide v1.3.2: Use Rev E or later
Check for running sessions from BOS, CFS, CRUS, FAS, and NMD.
ncn-w001# sat bootsys shutdown --stage session-checks
Expected output will look something like this:
Checking for active BOS sessions.
Found no active BOS sessions.
Checking for active CFS sessions.
Found no active CFS sessions.
Checking for active CRUS upgrades.
Found no active CRUS upgrades.
Checking for active FAS actions.
Found no active FAS actions.
Checking for active NMD dumps.
Found no active NMD dumps.
No active sessions exist. It is safe to proceed with the shutdown procedure.
Coordinate amongst system administration staff to prevent new sessions from starting in the services listed.
In CLE release 1.3, there is no method to prevent new sessions from being created as long as the service APIs are accessible on the API gateway
Shut down and power off all compute nodes and application nodes.
v1.3.2 use the “sat bootsys shutdown” command to do compute nodes and application nodes at the same time. See the Cray Shasta Administration Guide 1.3 S-8001 RevF (or later) in the section “Shut Down and Power Off Compute and User Access Nodes”
v1.3.0 use the “cray bos” command to create a shutdown session for the compute nodes and then for the application nodes See the Cray Shasta Administration Guide 1.3 S-8001 RevC (or later) in the section “Shut Down and Power Off Compute and User Access Nodes”
The management NCNs are expected to have certain minimum firmware installed for BMC, node BIOS, and PCIe card firmware.
Known issues for Shasta v1.3 systems include:
For minimum BIOS spec (required settings), see Node BIOS Preferences.
For minimum NCN firmware versions see Node Firmware.
For minimum Network switch firmware versions see Network Firmware.
For minimum Network switch configurations see Management Network Install.
WARNING
Skipping this on a system that is new to Shasta v1.4 (bare-metal or previously installed with Shasta v1.3 or earlier) can result in undesirable difficulties:
- Misnamed interfaces (missing
hsn0
)- Malfunctioning bonds (
bond0
)- Link failures (i.e. QLogic cards set to 10Gbps fixed)
- Malfunctioning or disabled Multi-Chassis LAGG
- Back-firing work-around scripts
The Shasta v1.3 system needs to change a few connections.
Installs in Shasta v1.4 are based on ncn-m001, which functions as the PIT (Pre-Install-Toolkit) node booted from the LiveCD with Shasta v1.4, instead of ncn-w001, which was the BIS node used for installation with Shasta v1.3 software. Systems are required to change their cabling to match.
See moving site connections to complete this step.
This is strongly encouraged to prevent overhead when adding new NCNs that the existing NCNs are re-cabled to facilitate PCIe PXE booting and “keeping NCNs the same.”
Installs for NCNs support PCIe PXE booting for deployment. Previous installations of Shasta v1.3 and earlier used their onboard interfaces to start PXE, before pivoting to their faster PCIe ports for Linux install. Now, everything is over the PCIe network interface card.
See PCIe Net-boot and Recable for information on enabling PCIe card PXE boot.
Shut down Ceph and the Kubernetes management cluster. This performs several actions to quiesce the management services and leaves each management NCN running Linux, but no other services.
If Shasta v1.3.2 or later is installed, use this command.
Shutdown platform services.
ncn-w001# sat bootsys shutdown --stage platform-services
If Shasta v1.3.0 or v1.3.1 is installed, follow these three steps from Cray Shasta Administration Guide 1.3 S-8001 RevC in section 6.6 Shut Down and Power Off the Management Kubernetes Cluster.
ncn-w001# ansible-playbook \
/opt/cray/crayctl/ansible_framework/main/platform-shutdown.yml
The management NCNs need to be powered off to facilitate a 1.4 install. Wiping the node will avoid boot mistakes, making the only viable option the PXE option. Below, use Ansible for wiping and shutting down the NCNs.
Since 1.3 installs used ncn-w001 as a place to run Ansible and host Ansible inventory, we will start by jumping from the manager node to ncn-w001.
# jumpbox
ncn-m001# ssh ncn-w001
ncn-w001#
Wipe disks on all nodes:
ncn-w001# ansible ncn -m shell -a 'wipefs --all --force /dev/sd*'
For disks which have no labels, no output will be shown by the wipefs commands being run. If one or more disks have labels, output similar to the following is expected:
/dev/sda: 8 bytes were erased at offset 0x00000200 (gpt): 45 46 49 20 50 41 52 54
/dev/sda: 8 bytes were erased at offset 0x6fc86d5e00 (gpt): 45 46 49 20 50 41 52 54
/dev/sda: 2 bytes were erased at offset 0x000001fe (PMBR): 55 aa
/dev/sdb: 8 bytes were erased at offset 0x00000200 (gpt): 45 46 49 20 50 41 52 54
/dev/sdb: 8 bytes were erased at offset 0x6fc86d5e00 (gpt): 45 46 49 20 50 41 52 54
/dev/sdb: 2 bytes were erased at offset 0x000001fe (PMBR): 55 aa
/dev/sdc: 6 bytes were erased at offset 0x00000000 (crypto_LUKS): 4c 55 4b 53 ba be
/dev/sdc: 6 bytes were erased at offset 0x00004000 (crypto_LUKS): 53 4b 55 4c ba be
The thing to verify is that there are no error messages in the output.
Power off all other nodes except ncn-w001 and ncn-m001.
ncn-w001# ansible ncn -m shell --limit='!ncn-w001:!ncn-m001' -a 'ipmitool power off'
Power off ncn-w001:
ncn-w001# ipmitool power off
At this time all that is left powered on is ncn-m001. The final ipmitool power off
command should disconnect the administrator from ncn-w001, leaving them on ncn-m001.
If the connection fails to disconnect, an administrator can escape and disconnect IPMI without exiting their SSH session by pressing ~.
until ipmitool
disconnects.
Save collected typescript and health check information off the system so it can be referenced later
Next: Starting an Installation
The system is now ready for Starting an Installation.
Each item below defines a prerequisite necessary for a bare-metal installation to succeed.
NOTE
On bare-metal, the LiveCD tool will assist with these steps.
A 1TB USB3.0 USB stick will be required in order to create a bootable LiveCD.
The LiveCD itself can be used out-of-the-box, and with only a little configuration it can serve for the various bare-metal prerequisite tasks.
Experimental - See LiveCD Quick Setup for either remote ISO path, this is useful for exploring a new system quickly. Other lab users may prefer a bootable USB stick to enable persistence, and for bringing artifacts for firmware updates.
Once you are booted into a LiveCD, proceed onto the next prerequisite steps for bare-metal.
New configuration files are needed for the installation of Shasta v1.4.
See the Service Guides for information regarding the four files.
To complete this step, the network configuration needs to be applied. For information on bare configurations, firmware, and more, see Management network install.
The management NCNs are expected to have certain minimum firmware installed for BMC, node BIOS, and PCIe card firmware. Where possible, the firmware should be updated prior to install. Some firmware can be updated during or after the Shasta v1.4 installation, but it is better to meet the minimum NCN firmware requirement before starting.
For minimum BIOS spec (required settings), see Node BIOS Preferences.
For minimum NCN firmware versions see Node Firmware.
For minimum Network switch firmware versions see Network Firmware.
For minimum Network switch configurations see Management Network Install.
WARNING
Skipping this on a system that is new to Shasta v1.4 (bare-metal or previously installed with Shasta v1.3 or earlier) can result in undesirable difficulties:
- Misnamed interfaces (missing
hsn0
)- Malfunctioning bonds (
bond0
)- Link failures (i.e. QLogic cards set to 10Gbps fixed)
- Malfunctioning or disabled Multi-Chassis LAGG
- Back-firing work-around scripts
Next: Starting an Installation
The system is now ready for Starting an Installation.
The following prerequisites must be completed in order to successfully reinstall Shasta v1.4.
Runtime DHCP services interfere with the LiveCD’s bootstrap nature to provide DHCP leases to BMCs. To remove edge-cases, disable the run-time cray-dhcp-kea pod.
Scale the deployment from either the LiveCD or any Kubernetes node
ncn# kubectl scale -n services --replicas=0 deployment cray-dhcp-kea
UANs and CNs do not need to be powered off.
The steps below detail how to prepare the NCNs.
Degraded System Notice
If the system is degraded; CRAY services are down, or the NCNs are in inconsistent states then a cleanslate should be performed. basic wipe from Disk Cleanslate
REQUIRED For each NCN, excluding ncn-m001, login and wipe it (this step uses the basic wipe from Disk Cleanslate):
NOTE
Pending completion of CASMINST-1659, the auto-wipe is insufficient for masters and workers. All administrators must wipe their NCNs with this step.
Wipe NCN disks from LiveCD (pit
)
pit# ncns=$(grep Bond0 /etc/dnsmasq.d/statics.conf | grep -v m001 | awk -F',' '{print $6}')
pit# for h in $ncns; do
read -r -p "Are you sure you want to wipe the disks on $h? [y/N] " response
response=${response,,}
if [[ "$response" =~ ^(yes|y)$ ]]; then
ssh $h 'wipefs --all --force /dev/sd* /dev/disk/by-label/*'
fi
done
Wipe NCN disks from ncn-m001
ncn-m001# ncns=$(grep ncn /etc/hosts | grep nmn | grep -v m001 | awk '{print $3}')
ncn-m001# for h in $ncns; do
read -r -p "Are you sure you want to wipe the disks on $h? [y/N] " response
response=${response,,}
if [[ "$response" =~ ^(yes|y)$ ]]; then
ssh $h 'wipefs --all --force /dev/sd* /dev/disk/by-label/*'
fi
done
In either case, for disks which have no labels, no output will be shown. If one or more disks have labels, output similar to the following is expected:
...
Are you sure you want to wipe the disks on ncn-m003? [y/N] y
/dev/sda: 8 bytes were erased at offset 0x00000200 (gpt): 45 46 49 20 50 41 52 54
/dev/sda: 8 bytes were erased at offset 0x6fc86d5e00 (gpt): 45 46 49 20 50 41 52 54
/dev/sda: 2 bytes were erased at offset 0x000001fe (PMBR): 55 aa
/dev/sdb: 8 bytes were erased at offset 0x00000200 (gpt): 45 46 49 20 50 41 52 54
/dev/sdb: 8 bytes were erased at offset 0x6fc86d5e00 (gpt): 45 46 49 20 50 41 52 54
/dev/sdb: 2 bytes were erased at offset 0x000001fe (PMBR): 55 aa
/dev/sdc: 6 bytes were erased at offset 0x00000000 (crypto_LUKS): 4c 55 4b 53 ba be
/dev/sdc: 6 bytes were erased at offset 0x00004000 (crypto_LUKS): 53 4b 55 4c ba be
...
The thing to verify is that there are no error messages in the output.
Power each NCN off using ipmitool
from ncn-m001 (or the booted LiveCD if reinstalling an incomplete
install).
Shutdown from LiveCD (pit
)
pit# export username=root
pit# export IPMI_PASSWORD=changeme
pit# conman -q | grep mgmt | grep -v m001 | xargs -t -i ipmitool -I lanplus -U $username -E -H {} power off
Shutdown from ncn-m001
ncn-m001# export username=root
ncn-m001# export IPMI_PASSWORD=changeme
ncn-m001# grep ncn /etc/hosts | grep mgmt | grep -v m001 | sort -u | awk '{print $2}' | xargs -t -i ipmitool -I lanplus -U $username -E -H {} power off
3. Set the BMCs on the systems back to DHCP.
NOTE
During the install of the NCNs their BMCs get set to static IP addresses. The installation expects the that the NCN BMCs are set back to DHCP before proceeding.
If you have Intel nodes run:
# export lan=3
Otherwise run:
# export lan=1
from the LiveCD (pit
):
NOTE
This step uses the old statics.conf on the system in case CSI changes IPs:
pit# export username=root
pit# export IPMI_PASSWORD=changeme
pit# for h in $( grep mgmt /etc/dnsmasq.d/statics.conf | grep -v m001 | awk -F ',' '{print $2}' )
do
ipmitool -U $username -I lanplus -H $h -E lan set $lan ipsrc dhcp
done
The timing of this change can vary based on the hardware, so if the IP can no longer be reached after running the above command, run these commands.
pit# for h in $( grep mgmt /etc/dnsmasq.d/statics.conf | grep -v m001 | awk -F ',' '{print $2}' )
do
ipmitool -U $username -I lanplus -H $h -E lan print $lan | grep Source
done
pit# for h in $( grep mgmt /etc/dnsmasq.d/statics.conf | grep -v m001 | awk -F ',' '{print $2}' )
do
ipmitool -U $username -I lanplus -H $h -E mc reset cold
done
from ncn-m001:
NOTE
This step uses to the/etc/hosts
file on ncn-m001 to determine the IP addresses of the BMCs:
ncn-m001# export username=root
ncn-m001# export IPMI_PASSWORD=changeme
ncn-m001# for h in $( grep ncn /etc/hosts | grep mgmt | grep -v m001 | awk '{print $2}' )
do
ipmitool -U $username -I lanplus -H $h -E lan set $lan ipsrc dhcp
done
The timing of this change can vary based on the hardware, so if the IP can no longer be reached after running the above command, run these commands.
ncn-m001# for h in $( grep ncn /etc/hosts | grep mgmt | grep -v m001 | awk '{print $2}' )
do
ipmitool -U $username -I lanplus -H $h -E lan print $lan | grep Source
done
ncn-m001# for h in $( grep ncn /etc/hosts | grep mgmt | grep -v m001 | awk '{print $2}' )
do
ipmitool -U $username -I lanplus -H $h -E mc reset cold
done
Skip this step if
you are planning to use this node as a staging area to create the LiveCD. Lastly, shutdown the LiveCD or ncn-m001 node.
ncn-m001# poweroff
With the nodes off, the system is now ready for Starting an Installation.
After finishing the prerequisites an installation can be started one of two ways.
All installs may be done in full from a LiveCD of any supported medium.
Experimental
NOTICE
the remote ISO runs entirely in the systems volatile memory.
For installs using the remote mounted LiveCD (no USB stick), pay attention to memory usage as artifacts are downloaded and subsequently extracted. When RAM is limited to less than 128GB, memory pressure may occur from increasing file-system usage. For instances where memory is scarce, an NFS/CIF or HTTP/S share can be mounted in-place of the USB’s data partition at
/var/www/ephemeral
. Using the same mount point as the USB data partition will help ward off mistakes when following along.