barebones image test may be skipped if all compute nodes are actively running application workloads.
(ncn-m002#) If a typescript session is already running in the shell, then first stop it with the exit command.
(ncn-m002#) Start a typescript.
script -af /root/csm_upgrade.$(date +%Y%m%d_%H%M%S).post_upgrade_health_validation.txt
export PS1='\u@\H \D{%Y-%m-%d} \t \w # '
If additional shells are opened during this procedure, then record those with typescripts as well. When resuming a procedure after a break, always be sure that a typescript is running before proceeding.
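One way to confirm that a typescript is running before resuming is to look for a script process among the ancestors of the current shell. The following is a minimal sketch, assuming a Linux /proc filesystem and a typescript started with the script command as shown above. The function name in_typescript is hypothetical and is not part of CSM.

```shell
# Hypothetical helper (not part of CSM): succeeds if an ancestor of the
# given PID (default: the current shell) is a process named "script".
in_typescript() {
    local pid="${1:-$$}" comm
    while [ -n "$pid" ] && [ "$pid" -gt 1 ]; do
        # /proc/<pid>/comm holds the short command name of the process.
        comm=$(cat "/proc/${pid}/comm" 2>/dev/null) || return 1
        [ "$comm" = "script" ] && return 0
        # Move up to the parent process.
        pid=$(awk '/^PPid:/ {print $2}' "/proc/${pid}/status" 2>/dev/null)
    done
    return 1
}
```

For example, `in_typescript || echo "No typescript is running in this shell"` prints a reminder when the shell was not started by script.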
(ncn-m002#) Validate CSM health.
Run the combined health check script, which runs a variety of health checks that should pass at this stage of the upgrade.
/opt/cray/tests/install/ncn/automated/ncn-k8s-combined-healthcheck
Review the output and follow the instructions provided to resolve any test failures. With the exception of Known issues with NCN health checks, all health checks are expected to pass.
(ncn-m002#) Stop typescripts.
For any typescripts that were started during the health validation procedure, stop them with the exit command.
(ncn-m002#) Back up upgrade logs and typescript files to a safe location.
If any typescript files are on different NCNs, then copy them to /root on ncn-m002.
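The copy can be done with scp. The following is a minimal sketch, assuming passwordless SSH between NCNs and typescripts that follow the csm_upgrade.*.txt naming used above. The function name collect_typescripts and the node names in the usage example are hypothetical.

```shell
# Hypothetical helper (not part of CSM): copy csm_upgrade typescripts from
# the named NCNs into /root on the node where this is run.
collect_typescripts() {
    local ncn
    for ncn in "$@"; do
        # The remote glob is quoted so that it is expanded on the remote
        # node rather than by the local shell.
        scp "${ncn}:/root/csm_upgrade.*.txt" /root/ \
            || echo "WARNING: nothing copied from ${ncn}" >&2
    done
}
```

For example, `collect_typescripts ncn-m001` would copy any matching typescripts from ncn-m001, if that is where other shells were recorded.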
Create a tar file containing the logs and typescript files.
If any typescript file names are not of the form csm_upgrade.*.txt, then append their names to the following tar command in order to include them.
TARFILE="csm_upgrade.$(date +%Y%m%d_%H%M%S).logs.tgz"
tar -czvf "/root/${TARFILE}" /root/csm_upgrade.*.txt /root/output.log
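As a sketch of how extra file names can be appended, the helper below wraps the tar command above, accepts additional files as arguments, and then lists the archive so that its contents can be checked. The function name backup_upgrade_logs and its -d option are hypothetical and exist only for illustration; the -d option lets the helper be tried against a directory other than /root.

```shell
# Hypothetical wrapper (not part of CSM) around the tar command above.
# Usage: backup_upgrade_logs [-d DIR] [EXTRA_FILE...]
backup_upgrade_logs() {
    local dir="/root"
    if [ "${1:-}" = "-d" ]; then
        dir="$2"
        shift 2
    fi
    # TARFILE is left set in the calling shell for later steps.
    TARFILE="csm_upgrade.$(date +%Y%m%d_%H%M%S).logs.tgz"
    # Same inputs as the documented command, plus any extra names passed in.
    tar -czvf "${dir}/${TARFILE}" "${dir}"/csm_upgrade.*.txt \
        "${dir}/output.log" "$@" || return 1
    # List the archive contents to confirm that every file was captured.
    tar -tzf "${dir}/${TARFILE}"
}
```

For example, `backup_upgrade_logs /root/my_other_session.txt` (a made-up file name) would include that typescript alongside the default set. Call the function directly rather than inside `$(...)` so that TARFILE remains set in the current shell.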
Upload the tar file into S3.
This step requires that the Cray Command Line Interface is configured on the node. This should already have been done on ncn-m002 during the upgrade process. If needed, see Configure the Cray CLI.
cray artifacts create config-data "${TARFILE}" "/root/${TARFILE}"
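To confirm that the upload succeeded, the bucket listing can be searched for the object name. The following is a minimal sketch, assuming that `cray artifacts list config-data --format json` returns a JSON listing in which each object name appears as a quoted string. The function name artifact_uploaded is hypothetical.

```shell
# Hypothetical check (not part of CSM): succeeds if the named object
# appears in the listing of the config-data bucket.
artifact_uploaded() {
    local name="$1"
    # Assumption: the JSON listing contains the object name as a quoted string.
    cray artifacts list config-data --format json | grep -qF "\"${name}\""
}
```

For example, `artifact_uploaded "${TARFILE}" && echo "Upload confirmed"` prints a confirmation only when the tar file is present in the bucket.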
Update ceph node-exporter config for SNMP counters.
OPTIONAL: This step uses the netstat collector from node-exporter and enables monitoring of all the SNMP counters in /proc/net/snmp on NCN nodes.
See Update ceph node-exporter configuration to update the ceph node-exporter configuration to monitor SNMP counters.