NOTE: CSM-0.9.4 or later CSM 0.9.x is required in order to upgrade to CSM-1.0.1 (available with Shasta v1.5).
NOTE: Installed CSM versions may be listed from the product catalog using the following command. This will sort a semantic version without a hyphenated suffix after the same semantic version with a hyphenated suffix, e.g.
1.0.0
>1.0.0-beta.19
.
Use the following command can be used to check the CSM version on the system:
ncn# kubectl get cm -n services cray-product-catalog -o json | jq -r '.data.csm' | tee csm-version.txt
This check will also be conducted in the prerequisites.sh
script listed below and will fail if the system is not running CSM-0.9.4, CSM-0.9.5, or CSM-1.0.0.
IMPORTANT:
Before running any upgrade scripts, be sure the Cray CLI output format is reset to default by running the following command:
ncn# unset CRAY_FORMAT
Copy the latest document RPM package to /root
and install it.
The install scripts will look for this RPM in /root
, so it is important that you copy it there.
Internet Connected
ncn-m001# wget https://storage.googleapis.com/csm-release-public/shasta-1.5/docs-csm/docs-csm-latest.noarch.rpm -P /root
ncn-m001# rpm -Uvh --force /root/docs-csm-latest.noarch.rpm
Air Gapped (replace the PATH_TO below with the location of the rpm)
ncn-m001# cp [PATH_TO_docs-csm-*.noarch.rpm] /root
ncn-m001# rpm -Uvh --force /root/docs-csm-*.noarch.rpm
IMPORTANT
For TDS systems with only three worker nodes, prior to proceeding with this upgrade CPU limits MUST be lowered on several services in order for this upgrade to succeed. See TDS Lower CPU Requests for information on how to accomplish this.
customizations.yaml
Perform these steps to update customizations.yaml
:
Extract customizations.yaml
from the site-init
secret:
ncn-m001# cd /tmp
ncn-m001# kubectl -n loftsman get secret site-init -o jsonpath='{.data.customizations\.yaml}' | base64 -d - > customizations.yaml
Update customizations.yaml
:
ncn-m001# /usr/share/doc/csm/upgrade/1.0.1/scripts/upgrade/update-customizations.sh -i customizations.yaml
Update the site-init
secret:
ncn-m001# kubectl delete secret -n loftsman site-init
ncn-m001# kubectl create secret -n loftsman generic site-init --from-file=customizations.yaml
If using an external Git repository for managing customizations as recommended,
clone a local working tree and commit appropriate changes to customizations.yaml
.
For example:
ncn-m001# git clone <URL> site-init
ncn-m001# cp /tmp/customizations.yaml site-init
ncn-m001# cd site-init
ncn-m001# git add customizations.yaml
ncn-m001# git commit -m 'Remove Gitea PVC configuration from customizations.yaml'
ncn-m001# git push
Return to original working directory:
ncn-m001# cd -
IMPORTANT:
Reminder: Before running any upgrade scripts, be sure the Cray CLI output format is reset to default by running the following command:
ncn# unset CRAY_FORMAT
Authenticate with the Cray CLI on ncn-m001
.
See Configure the Cray Command Line Interface for details on how to do this.
Set the CSM_RELEASE
variable to the correct value for the CSM release upgrade being applied.
ncn-m001# CSM_RELEASE=csm-1.0.1
Run check script:
NOTE The prerequisites.sh
script will warn that it will unmount /mnt/pitdata
, but this is not accurate. The script will only unmount it if the script itself mounts it. That is, if it is mounted when the script begins, the script will not unmount it.
Option 1 - Internet Connected Environment
Set the ENDPOINT
variable to the URL of the directory containing the CSM release tarball.
In other words, the full URL to the CSM release tarball will be ${ENDPOINT}${CSM_RELEASE}.tar.gz
NOTE This step is optional for Cray/HPE internal installs.
ncn-m001# ENDPOINT=https://put.the/url/here/
Run the script
NOTE The --endpoint
argument is optional for Cray/HPE internal use.
ncn-m001# /usr/share/doc/csm/upgrade/1.0.1/scripts/upgrade/prerequisites.sh --csm-version $CSM_RELEASE --endpoint $ENDPOINT
Option 2 - Air Gapped Environment
Set the TAR_DIR
variable to the directory on ncn-m001
containing the CSM release tarball.
In other words, the full path to the CSM release tarball will be ${TAR_DIR}/${CSM_RELEASE}.tar.gz
ncn-m001# TAR_DIR=/path/to/tarball/dir
Run the script
ncn-m001# /usr/share/doc/csm/upgrade/1.0.1/scripts/upgrade/prerequisites.sh --csm-version $CSM_RELEASE --tarball-file ${TAR_DIR}/${CSM_RELEASE}.tar.gz
IMPORTANT:
If any errors are encountered, then potential fixes should be displayed where the error occurred.
IF the prerequisites.sh
script fails and does not provide guidance, then try rerunning it. If the failure persists, then open a support ticket for guidance before proceeding.
To prevent any possibility of losing configuration data, backup the VCS data and store it in a safe location. See Version_Control_Service_VCS.md for these procedures.
IMPORTANT:
As part of this stage, only perform the backup, not the restore. The backup procedure is being done here as a precautionary step.
To prevent any possibility of losing Workload Manager configuration data or files, a back-up is required. Please execute all Backup procedures (for the Workload Manager in use) located in the Troubleshooting and Administrative Tasks
sub-section of the Install a Workload Manager
section of the HPE Cray Programming Environment Installation Guide: CSM on HPE Cray EX
. The resulting back-up data should be stored in a safe location off of the system.
To prevent accidental storage cloud-init runs and also to ensure the Ceph services are set to auto-start on boot, please run the below script on ncn-m001
:
ncn-m001# python3 /usr/share/doc/csm/scripts/patch-ceph-runcmd.py
In the event of a problem during the upgrade which may cause the loss of BSS data, perform the following to preserve this data.
ncn-m001# cray bss bootparameters list --format=json >bss-backup-$(date +%Y-%m-%d).json
The resulting file needs to be saved in the event that BSS data needs to be restored in the future.
Once the above steps have been completed, proceed to Stage 1.