Check and set the time for Gigabyte nodes.
If the console log indicates that the time on the compute nodes differs from the rest of the system by several hours, the spire-agent cannot obtain a valid certificate, which causes the node boot to drop into the dracut emergency shell.
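As a rough way to quantify "off by several hours", the following minimal sketch computes the absolute difference in whole hours between two timestamps expressed as seconds since the Unix epoch. The `clock_skew_hours` helper is hypothetical and not part of the product, and `NODE_EPOCH` is a placeholder for a value that would have to be read from the node's console log by hand.

```shell
# Hypothetical helper (not part of the product): prints the absolute
# difference, in whole hours, between two epoch timestamps.
clock_skew_hours() {
    local system_epoch="$1"
    local node_epoch="$2"
    local diff=$(( system_epoch - node_epoch ))
    # Take the absolute value so the direction of the skew does not matter.
    if [ "${diff}" -lt 0 ]; then
        diff=$(( -diff ))
    fi
    echo $(( diff / 3600 ))
}

# The system time is "now"; the node time below is a placeholder that
# simulates a node whose clock is five hours behind.
SYSTEM_EPOCH=$(date +%s)
NODE_EPOCH=$(( SYSTEM_EPOCH - 5 * 3600 ))
echo "Clock skew: $(clock_skew_hours "${SYSTEM_EPOCH}" "${NODE_EPOCH}") hour(s)"
```

A result of several hours would be consistent with the failure described above; a result of zero suggests the cause lies elsewhere.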
(ncn-mw#) Retrieve the cray-console-operator pod ID.
CONPOD=$(kubectl get pods -n services -o wide|grep cray-console-operator|awk '{print $1}'); echo ${CONPOD}
Example output:
cray-console-operator-79bf95964-qpcpp
The following steps should be repeated for each Gigabyte node which needs to have its BIOS time reset.
(ncn-mw#) Set the XNAME variable to the component name (xname) of the node whose console you wish to open.
XNAME=x3001c0s24b1n0
(ncn-mw#) Find the cray-console-node pod that is connected to that node.
NODEPOD=$(kubectl -n services exec "${CONPOD}" -c cray-console-operator -- \
sh -c "/app/get-node ${XNAME}" | jq .podname | sed 's/"//g') ; echo ${NODEPOD}
Example output:
cray-console-node-1
(ncn-mw#) Connect to the node’s console using ConMan on the identified cray-console-node pod.
kubectl exec -it -n services "${NODEPOD}" -- conman -j "${XNAME}"
Example output:
<ConMan> Connection to console [x3001c0s24b1] opened.
(ncn-mw#) In another terminal, boot the node to BIOS.
Set the BMC variable to the component name (xname) of the BMC for the node. This value will be different for each node.
BMC=x3001c0s24b1
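In the examples here, the BMC xname is the node xname with its trailing node suffix removed (x3001c0s24b1n0 becomes x3001c0s24b1). Assuming that convention holds for the nodes in question, a minimal sketch for deriving the value rather than typing it could look like the following. The `bmc_from_node_xname` helper is hypothetical; verify the result against the actual hardware before using it.

```shell
# Hypothetical helper: strip the trailing node suffix (n<digits>) from a
# node xname to obtain the xname of its BMC. Assumes the naming convention
# seen in the example values (x3001c0s24b1n0 -> x3001c0s24b1).
bmc_from_node_xname() {
    local node_xname="$1"
    echo "${node_xname}" | sed -E 's/n[0-9]+$//'
}

XNAME=x3001c0s24b1n0
BMC=$(bmc_from_node_xname "${XNAME}")
echo "${BMC}"
```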
Boot the node to BIOS.
read -s is used to prevent the password from being written to the screen or the shell history.
USERNAME=root
read -r -s -p "$BMC ${USERNAME} password: " IPMI_PASSWORD
export IPMI_PASSWORD
ipmitool -I lanplus -U "${USERNAME}" -E -H "${BMC}" chassis bootdev bios
ipmitool -I lanplus -U "${USERNAME}" -E -H "${BMC}" chassis power off
sleep 10
ipmitool -I lanplus -U "${USERNAME}" -E -H "${BMC}" chassis power on
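The fixed ten-second sleep assumes that the node has finished powering off by the time the power on command is issued. As an alternative, the following minimal sketch polls the BMC with the ipmitool chassis power status subcommand until it reports that power is off. The `wait_for_power_off` helper and its parameters are hypothetical, and the match on "is off" assumes the usual "Chassis Power is off" status text.

```shell
# Hypothetical helper: poll the BMC until it reports that chassis power is
# off. Arguments: BMC xname, number of attempts (default 12), and seconds
# between attempts (default 5). Relies on IPMI_PASSWORD being exported and
# on USERNAME being set as in the step above (falls back to root).
wait_for_power_off() {
    local bmc="$1"
    local attempts="${2:-12}"
    local interval="${3:-5}"
    local i=0
    while [ "${i}" -lt "${attempts}" ]; do
        if ipmitool -I lanplus -U "${USERNAME:-root}" -E -H "${bmc}" \
            chassis power status | grep -q "is off"; then
            return 0
        fi
        sleep "${interval}"
        i=$(( i + 1 ))
    done
    echo "Timed out waiting for ${bmc} to power off" >&2
    return 1
}

# Usage, in place of "sleep 10":
#   wait_for_power_off "${BMC}" && \
#       ipmitool -I lanplus -U "${USERNAME}" -E -H "${BMC}" chassis power on
```

If the helper times out, check the power state manually before issuing the power on command.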
(ncn-mw#) Update the System Date field to match the time on the system.
Use the terminal which is watching the console for this step. As the node powers on, it will complete POST (Power On Self Test) and then display the BIOS menu.
The System Date field is located under the Main tab in the navigation bar.
Press the F10 key followed by the Enter key to save the BIOS time.
Exit the connection to the console by entering the &. sequence.
Repeat the above steps for other nodes which need their BIOS time reset.