Stage 2 - Ceph image upgrade

IMPORTANT:

Reminder: Before running any upgrade scripts, reset the Cray CLI output format to its default by running the following command:
ncn# unset CRAY_FORMAT

In order for nodes to properly PXE boot, Border Gateway Protocol (BGP) must be healthy. Before proceeding, check the status of BGP as described in the Check BGP Status and Reset Sessions procedure.
Run ncn-upgrade-ceph-nodes.sh for ncn-s001. Follow the output of the script carefully; it will pause for manual interaction.
```
ncn-m001# /usr/share/doc/csm/upgrade/1.0.1/scripts/upgrade/ncn-upgrade-ceph-nodes.sh ncn-s001
```
NOTE: You may need to reset the root password for each node after it is rebooted.
Repeat the previous step for each remaining storage node, one at a time.
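
To see which storage nodes the step has to be repeated for, the node names can be pulled from /etc/hosts with the same pattern this page uses for the SSH key rescan. The following is a minimal sketch, not part of CSM: `list_storage_ncns` is a hypothetical helper name, and GNU grep (for `-P`) is assumed. It only prints the commands; each one still has to be run by hand, one node at a time, because the script pauses for manual interaction.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with CSM): print the unique storage NCN
# names found in a hosts file, one per line. Defaults to /etc/hosts.
list_storage_ncns() {
    local hosts_file="${1:-/etc/hosts}"
    grep -oP 'ncn-s\w+' "$hosts_file" | sort -u
}

# Print (do not execute) the upgrade command for every storage node.
for node in $(list_storage_ncns); do
    echo "/usr/share/doc/csm/upgrade/1.0.1/scripts/upgrade/ncn-upgrade-ceph-nodes.sh ${node}"
done
```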

After ncn-upgrade-ceph-nodes.sh has successfully run for all storage nodes, rescan SSH keys on all storage nodes by running the following commands on ncn-m001:

ncn-m001# grep -oP "(ncn-s\w+)" /etc/hosts |
            sort -u |
            xargs -t -i ssh {} 'truncate --size=0 ~/.ssh/known_hosts'
ncn-m001# grep -oP "(ncn-s\w+)" /etc/hosts |
            sort -u |
            xargs -t -i ssh {} 'grep -oP "(ncn-s\w+|ncn-m\w+|ncn-w\w+)" /etc/hosts |
                                sort -u |
                                xargs -t -i ssh-keyscan -H \{\} >> /root/.ssh/known_hosts'
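
Because `ssh-keyscan -H` writes hashed host names, a plain `grep` cannot confirm that the rescan picked up every node. `ssh-keygen -F` can look up hashed entries, so a check along the following lines may help. This is a sketch under assumptions: `missing_known_hosts` is a hypothetical helper name, not something provided by CSM.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every host given on the command line that has
# no entry in the given known_hosts file. `ssh-keygen -F` also matches
# hashed entries and exits non-zero when the host is not found.
missing_known_hosts() {
    local known_hosts="$1"
    shift
    local host
    for host in "$@"; do
        ssh-keygen -F "$host" -f "$known_hosts" >/dev/null 2>&1 || echo "$host"
    done
}

# Example use on a storage node (prints nothing if all NCNs are present):
#   missing_known_hosts /root/.ssh/known_hosts \
#       $(grep -oP 'ncn-[smw]\w+' /etc/hosts | sort -u)
```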

Deploy node-exporter and alertmanager.

NOTE: Run this process on a node that is running ceph-mon, which in most cases is one of ncn-s001, ncn-s002, or ncn-s003. Run it only once, not on each of these nodes.

Deploy node-exporter:

ncn-s# ceph orch apply node-exporter
Scheduled node-exporter update...

Deploy alertmanager:

ncn-s# ceph orch apply alertmanager
Scheduled alertmanager update...

Verify node-exporter is running:

IMPORTANT: There should be one node-exporter container per Ceph node.

ncn-s# ceph orch ps --daemon_type node-exporter
NAME                    HOST      STATUS         REFRESHED  AGE  VERSION  IMAGE NAME                                       IMAGE ID           CONTAINER ID
node-exporter.ncn-s001  ncn-s001  running (57m)  3m ago     67m  0.18.1   docker.io/prom/node-exporter:v0.18.1             e5a616e4b9cf       3465eade21da
node-exporter.ncn-s002  ncn-s002  running (57m)  3m ago     67m  0.18.1   registry.local/prometheus/node-exporter:v0.18.1  e5a616e4b9cf       7ed9b6cc9991
node-exporter.ncn-s003  ncn-s003  running (57m)  3m ago     67m  0.18.1   registry.local/prometheus/node-exporter:v0.18.1  e5a616e4b9cf       1078d9e555e4

Verify alertmanager is running:

IMPORTANT: There should be a single alertmanager container for the cluster.

ncn-s# ceph orch ps --daemon_type alertmanager
NAME                   HOST      STATUS         REFRESHED  AGE  VERSION  IMAGE NAME                                      IMAGE ID           CONTAINER ID
alertmanager.ncn-s001  ncn-s001  running (66m)  3m ago     68m  0.20.0   registry.local/prometheus/alertmanager:v0.20.0  0881eb8f169f       775aa53f938f
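
The two verification steps above can also be checked by counting the rows reported as running. The following is a rough sketch rather than a supported tool: `count_running` is a hypothetical helper, and it assumes the column layout shown in the sample output (NAME, HOST, STATUS, ...), which may differ between Ceph releases.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `ceph orch ps` output on stdin and print how many
# rows have "running" as the first word of the STATUS (third) column.
# The header row is skipped.
count_running() {
    awk 'NR > 1 && $3 == "running" { n++ } END { print n + 0 }'
}

# Example use on a node running ceph-mon:
#   ceph orch ps --daemon_type node-exporter | count_running   # expect one per Ceph node
#   ceph orch ps --daemon_type alertmanager  | count_running   # expect 1
```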

Update BSS to ensure the Ceph images are loaded if a node is rebuilt.

ncn-m001# . /usr/share/doc/csm/upgrade/1.0.1/scripts/ceph/lib/update_bss_metadata.sh
ncn-m001# update_bss_storage

Ensure the Ceph services are set to autostart by running the following script on ncn-s001:
```
ncn-s001# /srv/cray/scripts/common/ceph-enable-services.sh
```

Once Stage 2 has completed successfully, all Ceph nodes have been rebooted into the new image. Proceed to Stage 3.