Use this procedure to finish validating the success of rebuilt NCNs.
After rebooting the node, confirm the Configuration Framework Service (CFS) configuration_status for its desired_config.
NOTE: The following command indicates whether a CFS job is currently in progress for this node.
IMPORTANT: The following command assumes that the variables from the prerequisites section have been set.
cray cfs v3 components describe "${XNAME}" --format json | jq .configuration_status
Example output:
"configured"
If the configuration_status is pending, wait for the CFS job to finish before continuing. If the configuration_status is failed, a CFS job has failed
for this node and the failure should be addressed now. If the configuration_status is unconfigured and the NCN personalization procedure has not yet been done
as part of an install, this can be ignored.
If the configuration_status is failed, see Troubleshoot Failed CFS Sessions for how to analyze the pod logs from cray-cfs in order to determine why the configuration may not have completed.
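The guidance above to wait while the status is pending could be scripted along the following lines. This is a minimal sketch and not part of the documented procedure: the function names get_cfs_status and wait_for_cfs, the 30-second default interval, the attempt limit, and the return codes are all assumptions chosen for illustration. It reuses the cray cfs command from this step, adding jq -r so that the status is printed without surrounding quotes.

```shell
# Hypothetical helper (not part of CSM): print the configuration_status for a node.
# Uses the same command as the step above; "jq -r" strips the JSON quotes.
get_cfs_status() {
    cray cfs v3 components describe "$1" --format json | jq -r .configuration_status
}

# Hypothetical helper (not part of CSM): poll until the status is no longer pending.
# Usage: wait_for_cfs XNAME [INTERVAL_SECONDS] [MAX_ATTEMPTS]
# Return codes (illustrative): 0 configured, 1 failed, 2 unconfigured,
# 3 unexpected value, 4 still pending after MAX_ATTEMPTS checks.
wait_for_cfs() {
    local xname interval max status i
    xname="$1"
    interval="${2:-30}"
    max="${3:-60}"
    i=1
    while [ "$i" -le "$max" ]; do
        status="$(get_cfs_status "$xname")"
        case "$status" in
            configured)   echo "configured"; return 0 ;;
            failed)       echo "failed"; return 1 ;;
            unconfigured) echo "unconfigured"; return 2 ;;
            pending)      sleep "$interval" ;;
            *)            echo "unexpected status: ${status}" >&2; return 3 ;;
        esac
        i=$((i + 1))
    done
    echo "timed out while status was pending" >&2
    return 4
}
```

With the prerequisite variables set, the helper would be called as wait_for_cfs "${XNAME}". A return code of 1 corresponds to the failed case described above and still needs to be investigated manually.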
Collect data about the system management platform health. These commands can be run from a master or worker NCN.
/opt/cray/platform-utils/ncnHealthChecks.sh
/opt/cray/platform-utils/ncnPostgresHealthChecks.sh
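Because this step is about collecting data, it may be convenient to keep a copy of each script's output for later review. The following is a minimal sketch under stated assumptions: run_and_log is a hypothetical helper, not a CSM utility, and the log directory and file naming scheme are illustrative only.

```shell
# Hypothetical helper (not part of CSM): run a command, display its output,
# and save a timestamped copy under LOGDIR.
# Usage: run_and_log LOGDIR COMMAND [ARGS...]
run_and_log() {
    local logdir name log
    logdir="$1"
    shift
    mkdir -p "$logdir" || return 1
    name="$(basename "$1")"
    log="${logdir}/${name}.$(date +%Y%m%dT%H%M%S).log"
    # stderr is merged into stdout so that errors are captured in the log too.
    # Note: the return status is that of tee, not of the command itself.
    "$@" 2>&1 | tee "$log"
}
```

For example, run_and_log /tmp/ncn-health /opt/cray/platform-utils/ncnHealthChecks.sh would write the output to a file under /tmp/ncn-health while still displaying it on the terminal; the directory name here is an arbitrary choice.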
Return to the main Rebuild NCNs page.