This page details the steps to take before starting a new installation. These steps assume that all of the NCNs have been deployed (e.g. there is no more PIT node).
NOTE
Skip this section if compute nodes and application nodes are not booted.
The application and compute nodes must be shut down prior to a reinstallation. If they are left on, they may end up in an undesirable state.
See Shut Down and Power Off Compute and User Access Nodes.
NOTE
Skip this section if the CSM install was incomplete or not started.
The DHCP service running in Kubernetes needs to be disabled or it will conflict with the PIT’s DHCP services.
(ncn-mw#) Disable cray-dhcp-kea:
kubectl scale -n services --replicas=0 deployment cray-dhcp-kea
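The scale-down can be confirmed before moving on. The following is a minimal sketch, not part of the documented procedure; the function name dhcp_kea_is_disabled is hypothetical, and it assumes kubectl is configured on the node where it runs.

```shell
# Hypothetical helper: succeed only if the cray-dhcp-kea deployment
# in the services namespace requests zero replicas.
dhcp_kea_is_disabled() {
    local replicas
    # .spec.replicas is the desired replica count set by `kubectl scale`.
    replicas="$(kubectl get deployment -n services cray-dhcp-kea \
        -o jsonpath='{.spec.replicas}')" || return 1
    [[ "${replicas}" == "0" ]]
}
```

For example, running dhcp_kea_is_disabled && echo "cray-dhcp-kea is scaled to 0" prints the message only when the deployment has been scaled down.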
The upcoming procedures use ipmitool. Set IPMI credentials for the BMCs of the NCNs.
read -s is used in order to prevent the credentials from being displayed on the screen or recorded in the shell history.
(ncn# or pit#) Set the username and IPMI password:
username=root
read -r -s -p "NCN BMC ${username} password: " IPMI_PASSWORD
(ncn# or pit#) Export IPMI_PASSWORD for the -E option to work on ipmitool:
export IPMI_PASSWORD
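Because ipmitool -E reads the password from the environment, a variable that is set but not exported causes authentication to fail. The following is a minimal sketch of a pre-flight check; the function name check_ipmi_env is hypothetical and not part of the documented procedure.

```shell
# Hypothetical helper: confirm the variables used by `ipmitool -U ... -E`.
check_ipmi_env() {
    if [[ -z "${username:-}" ]]; then
        echo "username is not set" >&2
        return 1
    fi
    if [[ -z "${IPMI_PASSWORD:-}" ]]; then
        echo "IPMI_PASSWORD is empty or not set" >&2
        return 1
    fi
    # printenv only sees exported variables, which is also what ipmitool sees.
    # Its output (the password) is discarded so nothing reaches the screen.
    if ! printenv IPMI_PASSWORD >/dev/null; then
        echo "IPMI_PASSWORD is set but not exported" >&2
        return 1
    fi
    return 0
}
```

Running check_ipmi_env before the power and BMC steps reports which of the three conditions is not met, without ever printing the password.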
NOTE
Skip this section if none of the management nodes are booted.
Power each NCN off using ipmitool from ncn-m001 (or the PIT node, if reinstalling an incomplete install).
Get the inventory of BMCs.
(pit#) From the PIT:
readarray -t BMCS < <(conman -q | grep -v m001 | sort -u)
(ncn#) From an NCN (e.g. ncn-m001):
readarray -t BMCS < <(grep mgmt /etc/hosts | awk '{print $NF}' | grep -v m001 | sort -u)
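The /etc/hosts pipeline keeps the last field of every line containing mgmt, drops ncn-m001, and removes duplicates. The following is a minimal sketch that wraps the same pipeline and adds a guard against an empty result; the function name list_mgmt_bmcs and its optional file argument are hypothetical, added only so the pipeline can be tried against a copy of the hosts file.

```shell
# Hypothetical wrapper around the documented /etc/hosts pipeline.
# Usage: list_mgmt_bmcs [HOSTS_FILE]   (defaults to /etc/hosts)
list_mgmt_bmcs() {
    local hosts_file="${1:-/etc/hosts}"
    local names
    # Same filter as the documented command: last field of each "mgmt" line,
    # excluding ncn-m001, de-duplicated.
    names="$(grep mgmt "${hosts_file}" | awk '{print $NF}' | grep -v m001 | sort -u)" || true
    if [[ -z "${names}" ]]; then
        echo "No BMC hostnames found in ${hosts_file}" >&2
        return 1
    fi
    printf '%s\n' "${names}"
}
```

The array can then be filled with readarray -t BMCS < <(list_mgmt_bmcs), and an empty inventory is reported on standard error instead of silently producing an empty array.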
(ncn# or pit#) Power off NCNs.
printf "%s\n" ${BMCS[@]} | xargs -t -I {} ipmitool -I lanplus -U "${username}" -E -H {} power off
(ncn# or pit#) Check the power status to confirm that the nodes have powered off.
printf "%s\n" ${BMCS[@]} | xargs -t -I {} ipmitool -I lanplus -U "${username}" -E -H {} power status
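On a system with many NCNs, it can be easier to list only the nodes that have not yet powered off. The following is a minimal sketch, assuming ipmitool reports a powered-off node as "Chassis Power is off"; the function name bmcs_still_on and the IPMITOOL override variable are hypothetical and exist only in this example.

```shell
# Hypothetical helper: print each BMC whose node is not confirmed powered off.
# IPMITOOL may be set to an alternate command for dry runs (an assumption of
# this sketch); it defaults to ipmitool.
bmcs_still_on() {
    local bmc status
    for bmc in "$@"; do
        status="$("${IPMITOOL:-ipmitool}" -I lanplus -U "${username}" -E \
            -H "${bmc}" power status 2>/dev/null)"
        # Unreachable BMCs are also listed, because their state is unknown.
        if [[ "${status}" != *"Chassis Power is off"* ]]; then
            echo "${bmc}"
        fi
    done
}
```

Calling bmcs_still_on ${BMCS[@]} prints nothing once every node is confirmed off; any hostname that is printed needs another look before continuing.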
During the install of the management nodes, their BMCs get set to static IP addresses. The installation expects these BMCs to be set back to DHCP before proceeding.
NOTE
These steps require that the Set IPMI credentials steps have been performed.
Get the inventory of BMCs.
(pit#) From the PIT:
readarray -t BMCS < <(conman -q | grep -v m001 | sort -u)
(ncn#) From an NCN (e.g. ncn-m001):
readarray -t BMCS < <(grep mgmt /etc/hosts | awk '{print $NF}' | grep -v m001 | sort -u)
Set the BMCs to DHCP.
function bmcs_set_dhcp {
    local bmc lan
    for bmc in ${BMCS[@]}; do
        # By default, the LAN for the BMC is LAN channel 1, except on Intel systems.
        # Reset the channel for every BMC so that a previous result does not carry over.
        lan=1
        # Probe channel 3; the output is discarded because only the exit status matters.
        if ipmitool -I lanplus -U "${username}" -E -H "${bmc}" lan print 3 >/dev/null 2>&1; then
            lan=3
        fi
        printf "Setting %s to DHCP ... " "${bmc}"
        if ipmitool -I lanplus -U "${username}" -E -H "${bmc}" lan set "${lan}" ipsrc dhcp; then
            echo "Done"
        else
            echo "Failed!"
        fi
    done
}
bmcs_set_dhcp
Perform a cold reset of any BMCs which are still reachable.
function bmcs_cold_reset {
    local bmc
    for bmc in ${BMCS[@]}; do
        printf "Performing cold reset of %s ... " "${bmc}"
        if ipmitool -I lanplus -U "${username}" -E -H "${bmc}" mc reset cold; then
            echo "Done"
        else
            echo "Failed!"
        fi
    done
}
bmcs_cold_reset
NOTE
As long as every BMC is either not reachable or receives a cold reset, this step is successful.
NOTE
Skip this step if planning to use this node as a staging area to create the USB LiveCD.
(ncn-m001# or pit#) Shut down the LiveCD or ncn-m001 node.
poweroff
The process is now done; the NCNs are ready for a new deployment.
If ncn-m001 is being used to prepare the USB LiveCD, then remove the Kubernetes IP addresses from /etc/resolv.conf and add a valid external DNS server.
If ncn-m001 is being used to prepare the USB LiveCD, then ensure that there is enough free disk space for the CSM tar archive to be downloaded and unpacked.
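The free space can be checked from the shell before downloading anything. The following is a minimal sketch; the function name has_free_space is hypothetical, and the required size is left as a parameter because this page does not state how large the CSM tar archive is.

```shell
# Hypothetical helper: succeed if the filesystem holding PATH has at least
# MIN_GIB gibibytes available.
# Usage: has_free_space PATH MIN_GIB
has_free_space() {
    local path="$1" min_gib="$2" avail_kib
    # df -Pk prints POSIX-format output in 1024-byte blocks;
    # the fourth column of the second line is the available space.
    avail_kib="$(df -Pk "${path}" | awk 'NR==2 {print $4}')"
    (( avail_kib >= min_gib * 1024 * 1024 ))
}
```

For example, has_free_space /root 100 || echo "Not enough free space" warns when less than 100 GiB is available; both the path and the 100 GiB figure are placeholders to be replaced with the real download location and archive size.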