Use this procedure to troubleshoot Kubernetes pods that are scheduled on an NCN but never reach the Running state.
The kubectl get pod command returns pods that seem to be stuck in the Init or ContainerCreating state.
Run the kubectl get pod -o wide command to identify the node where the pod is not starting.
kubectl get pod -A -o wide | grep -E 'Init|ContainerCreating'
services cray-sls-58cfdb7c46-b7dbj 0/2 Init:0/2 0 2d22h 10.39.0.165 ncn-w002 <none> <none>
services gitea-vcs-65c98746b-jk5v7 0/2 ContainerCreating 0 2d3h 10.47.0.104 ncn-w002 <none> <none>
In the above example, ncn-w002 is the node that may need attention.
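When many pods are affected, it can help to summarize the stuck pods per node. The following is a minimal sketch, not part of the original procedure. The function name stuck_pod_nodes is hypothetical, and the sketch assumes the column layout shown in the example output above (STATUS in column 4 and NODE in column 8 when -A is used). Newer kubectl versions may print RESTARTS with extra fields, which would shift those columns.

```shell
# Hypothetical helper (not a standard tool): reads `kubectl get pod -A -o wide`
# output on stdin and prints each node that hosts pods whose status starts
# with Init or ContainerCreating, followed by the number of such pods.
stuck_pod_nodes() {
    # Assumption: column 4 is STATUS and column 8 is NODE, as in the example above.
    awk '$4 ~ /^(Init|ContainerCreating)/ { count[$8]++ }
         END { for (node in count) print node, count[node] }' | sort
}

# Usage on a live system (assumes kubectl is configured on this host):
#   kubectl get pod -A -o wide --no-headers | stuck_pod_nodes
```

For the two example pods above, this would report ncn-w002 with a count of 2.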
Perform the following steps on the node identified in the previous step.
Restart the kubelet service.
systemctl restart kubelet
Ensure that kubelet is running.
systemctl status kubelet
Restart the containerd service.
systemctl restart containerd
Ensure that containerd is running.
systemctl status containerd
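The two status checks above can also be combined into one scripted check. This is a minimal sketch rather than part of the original procedure. The function name check_units is hypothetical, and it relies only on systemctl is-active --quiet, which exits with status 0 when a unit is active.

```shell
# Hypothetical helper: report whether each named systemd unit is active.
# Returns non-zero if any unit is not active.
check_units() {
    rc=0
    for unit in "$@"; do
        if systemctl is-active --quiet "$unit"; then
            echo "$unit: active"
        else
            echo "$unit: NOT active"
            rc=1
        fi
    done
    return "$rc"
}

# Usage on the affected node:
#   check_units kubelet containerd
```

If either unit is reported as not active, systemctl status for that unit shows the detailed state and recent log lines.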
Try running the kubectl get pod command again; within a few minutes, the pods should transition to the Running state.
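Instead of rerunning the command by hand, the check can be polled. The sketch below is an illustration under stated assumptions, not the documented method. The function name wait_for_pods and its defaults (10 attempts, 30 seconds apart, about five minutes in total) are hypothetical choices. It uses the spec.nodeName field selector to limit the query to one node.

```shell
# Hypothetical helper: poll until no pod on the given node is in an
# Init or ContainerCreating state.
# Arguments: node name, attempts (default 10), delay in seconds (default 30).
# Returns 0 when no pods are stuck, 1 on timeout, 2 if kubectl fails.
wait_for_pods() {
    node=$1
    attempts=${2:-10}
    delay=${3:-30}
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Stop if kubectl itself fails, rather than reporting a false success.
        pods=$(kubectl get pod -A -o wide --no-headers \
            --field-selector "spec.nodeName=${node}") || return 2
        # grep -c prints 0 and exits non-zero when nothing matches.
        stuck=$(printf '%s\n' "$pods" | grep -cE 'Init|ContainerCreating' || true)
        if [ "$stuck" -eq 0 ]; then
            echo "No pods stuck on ${node}"
            return 0
        fi
        echo "Attempt ${i}/${attempts}: ${stuck} pod(s) still starting on ${node}"
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Usage (node name taken from the example above):
#   wait_for_pods ncn-w002
```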