Stage 1 - Ceph Image Upgrade

Procedure

IMPORTANT:

Before running any upgrade scripts, ensure that the Cray CLI output format is reset to its default by running the following command:

ncn# unset CRAY_FORMAT
  1. For nodes to PXE boot properly, Border Gateway Protocol (BGP) must be healthy. Before proceeding, check the status of BGP as described in the Check BGP Status and Reset Sessions procedure.
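
    The authoritative steps are in the referenced procedure. As a quick spot-check from the cluster side, confirm that the MetalLB speaker pods, which maintain the BGP sessions toward the spine switches, are all Running (this assumes MetalLB is deployed in the metallb-system namespace, as is typical on CSM systems):

    ncn-m001# kubectl get pods -n metallb-system -o wide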

  2. Run ncn-upgrade-ceph-nodes.sh for ncn-s001. Follow the output of the script carefully. The script will pause for manual interaction.

    ncn-m001# /usr/share/doc/csm/upgrade/1.0.11/scripts/upgrade/ncn-upgrade-ceph-nodes.sh ncn-s001
    

    NOTE: The root password for the node may need to be reset after it is rebooted.
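
    If the password does need to be reset, it can be set from the node's console (or from an SSH session, if one can still be established) with the standard passwd command, for example:

    ncn-s001# passwd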

  3. Repeat the previous step for each remaining storage node, one at a time. The storage nodes present on the system can be listed as shown below.
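
    To list the storage nodes, reuse the same /etc/hosts pattern that appears in the next step:

    ncn-m001# grep -oP "(ncn-s\w+)" /etc/hosts | sort -u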

  4. After ncn-upgrade-ceph-nodes.sh has successfully run for all storage nodes, rescan the SSH keys on all storage nodes:

    ncn-m001# grep -oP "(ncn-s\w+)" /etc/hosts | sort -u | xargs -t -i ssh {} 'truncate --size=0 ~/.ssh/known_hosts'
    ncn-m001# grep -oP "(ncn-s\w+)" /etc/hosts | sort -u | xargs -t -i ssh {} 'grep -oP "(ncn-s\w+|ncn-m\w+|ncn-w\w+)" /etc/hosts | sort -u | xargs -t -i ssh-keyscan -H \{\} >> /root/.ssh/known_hosts'
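
    To spot-check that the rescan worked, a non-interactive SSH command between storage nodes should now succeed without any host key prompt, for example:

    ncn-m001# ssh ncn-s001 'ssh -o BatchMode=yes ncn-s002 date'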
    
  5. Deploy node-exporter and alertmanager.

    NOTE: This procedure must be run on a node running ceph-mon, which in most cases will be ncn-s001, ncn-s002, and ncn-s003. It only needs to be run once, not on every one of these nodes.
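
    To confirm which nodes are running ceph-mon, the same orchestrator query used in the verification steps below can be pointed at the mon daemons:

      ncn-s# ceph orch ps --daemon_type mon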

    1. Deploy node-exporter and alertmanager.

      ncn-s# ceph orch apply node-exporter && ceph orch apply alertmanager
      

      Expected output looks similar to the following:

      Scheduled node-exporter update...
      Scheduled alertmanager update...
      
    2. Verify that node-exporter is running.

      IMPORTANT: There should be one node-exporter container per Ceph node.

      ncn-s# ceph orch ps --daemon_type node-exporter
      

      Expected output on a system with three Ceph nodes should look similar to the following:

      NAME                    HOST      STATUS         REFRESHED  AGE  VERSION  IMAGE NAME                                       IMAGE ID           CONTAINER ID
      node-exporter.ncn-s001  ncn-s001  running (57m)  3m ago     67m  0.18.1   docker.io/prom/node-exporter:v0.18.1             e5a616e4b9cf       3465eade21da
      node-exporter.ncn-s002  ncn-s002  running (57m)  3m ago     67m  0.18.1   registry.local/prometheus/node-exporter:v0.18.1  e5a616e4b9cf       7ed9b6cc9991
      node-exporter.ncn-s003  ncn-s003  running (57m)  3m ago     67m  0.18.1   registry.local/prometheus/node-exporter:v0.18.1  e5a616e4b9cf       1078d9e555e4
      
    3. Verify that alertmanager is running.

      IMPORTANT: There should be a single alertmanager container for the cluster.

      ncn-s# ceph orch ps --daemon_type alertmanager
      

      Expected output looks similar to the following:

      NAME                   HOST      STATUS         REFRESHED  AGE  VERSION  IMAGE NAME                                      IMAGE ID           CONTAINER ID
      alertmanager.ncn-s001  ncn-s001  running (66m)  3m ago     68m  0.20.0   registry.local/prometheus/alertmanager:v0.20.0  0881eb8f169f       775aa53f938f
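
    As an additional sanity check (a general Ceph status command, not specific to this procedure), review the overall cluster status from the same node before moving on:

      ncn-s# ceph -s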
      
  6. Update BSS to ensure that the Ceph images are loaded if a node is rebuilt.

    ncn-m001# . /usr/share/doc/csm/upgrade/1.0.11/scripts/ceph/lib/update_bss_metadata.sh
    ncn-m001# update_bss_storage
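
    To spot-check the result, the boot parameters stored in BSS for a storage node can be inspected with the Cray CLI. This is only a sketch, assuming the Cray CLI is initialized; replace <xname> with the storage node's component name, and note that the exact fields to review depend on the release:

    ncn-m001# cray bss bootparameters list --hosts <xname> --format json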
    
  7. Ensure that the Ceph services are set to start automatically.

    On ncn-s001:

    ncn-s001# /srv/cray/scripts/common/ceph-enable-services.sh
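
    To confirm the result, list the enabled Ceph systemd units on the node. Unit names vary with the Ceph release and deployment, so treat this as a sketch:

    ncn-s001# systemctl list-unit-files 'ceph*' --state=enabled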
    

Stage completed

All of the Ceph nodes have now been rebooted into the new image.

This stage is complete. Continue to Stage 2.