Nexus Service Recovery

This procedure covers redeploying the Nexus service and restoring its data.

Prerequisites

  • The system is fully installed and has transitioned off the LiveCD.
  • All activities required for site maintenance are complete.
  • A backup or export of the data already exists.
  • The latest CSM documentation has been installed on the master nodes. See Check for Latest Documentation.

Service recovery for Nexus

  1. (ncn-mw#) Verify that an export of the Nexus PVC data exists.

    1. Verify that the nexus-bak PVC exists.

      kubectl get pvc -n nexus nexus-bak
      

      Example output:

      NAME         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS           AGE
      nexus-bak    Bound    pvc-f058bf3b-97c0-4d7e-ab60-7294eaa18788   1000Gi     RWX            ceph-cephfs-external   6d
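
The check in step 1 can also be scripted so that it fails loudly when the backup PVC is missing or not bound. The following is a minimal sketch, assuming kubectl is configured on the node; pvc_is_bound is a hypothetical helper name and is not part of CSM.

```shell
# Hypothetical helper (not a CSM command): succeed only if the named PVC
# exists and reports the phase "Bound".
pvc_is_bound() {
    # $1 = namespace, $2 = PVC name
    pvc_phase="$(kubectl get pvc -n "$1" "$2" -o jsonpath='{.status.phase}' 2>/dev/null)" || return 1
    [ "${pvc_phase}" = "Bound" ]
}
```

For example, running pvc_is_bound nexus nexus-bak || echo "nexus-bak is missing or not Bound" gives a clear signal to stop before anything is uninstalled.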
      
  2. (ncn-mw#) Download the cray-nexus chart from Nexus.

    This step can only be performed if Nexus is available and healthy. If Nexus is unavailable, download the CSM release tarball and extract the Helm chart from it instead.

    1. Configure Nexus as a Helm repository.

      helm repo add nexus https://packages.local/repository/charts
      

      Example output:

      "nexus" has been added to your repositories
      
    2. Create a directory for the Helm chart.

      mkdir helm
      
    3. Download the cray-nexus chart.

      helm pull nexus/cray-nexus -d ./helm
      

      This command produces no output; however, the chart tar file should now exist in the helm directory.

      Example contents of the helm directory:

      cray-nexus-0.11.1.tgz
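
Because helm pull prints nothing, it can help to confirm that the chart file from step 2 is present before uninstalling anything. This is a sketch only; find_chart_tarball is a hypothetical helper name, and the version in the file name will vary by CSM release.

```shell
# Hypothetical helper: print the path of the first <chart>-*.tgz file found
# in the given directory, or return non-zero if none exists.
find_chart_tarball() {
    # $1 = directory, $2 = chart name
    for chart_file in "$1/$2"-*.tgz; do
        if [ -f "${chart_file}" ]; then
            echo "${chart_file}"
            return 0
        fi
    done
    return 1
}
```

For example, find_chart_tarball ./helm cray-nexus prints the tarball path when the download succeeded. The printed path can then be passed to helm show chart to inspect the chart version.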
      
  3. (ncn-mw#) Uninstall the chart and wait for the resources to terminate.

    1. Note the version of the chart that is currently deployed.

      helm history -n nexus cray-nexus
      

      Example output:

      REVISION    UPDATED                     STATUS      CHART               APP VERSION DESCRIPTION
      1           Tue Aug  2 22:14:31 2022    deployed    cray-nexus-0.6.0    3.25.0      Install complete
      
    2. Uninstall the chart.

      helm uninstall -n nexus cray-nexus
      

      Example output:

      release "cray-nexus" uninstalled
      
    3. Wait for the resources to terminate, then delete the nexus-data PVC if it still exists. Do not delete the nexus-bak PVC.

      1. Verify that no Nexus pods are running.

        watch "kubectl get pods -n nexus -l app=nexus"
        

        Example output:

        No resources found in nexus namespace.
        
      2. Delete the nexus-data PVC if it still exists.

        kubectl delete pvc nexus-data -n nexus
        

        Example output:

        persistentvolumeclaim "nexus-data" deleted
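
As an alternative to watching the pod list by hand in step 3, the wait can be expressed as a polling loop. The sketch below assumes the app=nexus label shown above; wait_for_no_pods is a hypothetical helper name, and the timeout and interval defaults are illustrative values only.

```shell
# Hypothetical helper: poll until no pods match the selector.
# Returns 0 when no pods remain, 1 on timeout, 2 if kubectl itself fails.
wait_for_no_pods() {
    # $1 = namespace, $2 = label selector,
    # $3 = timeout in seconds (default 300), $4 = poll interval (default 5)
    wait_timeout="${3:-300}"
    wait_interval="${4:-5}"
    wait_elapsed=0
    while :; do
        # "-o name" prints one line per pod and nothing on stdout when none exist.
        pods="$(kubectl get pods -n "$1" -l "$2" -o name 2>/dev/null)" || return 2
        if [ -z "${pods}" ]; then
            return 0
        fi
        if [ "${wait_elapsed}" -ge "${wait_timeout}" ]; then
            return 1
        fi
        sleep "${wait_interval}"
        wait_elapsed=$((wait_elapsed + wait_interval))
    done
}
```

For example, wait_for_no_pods nexus app=nexus returns once the Nexus pods are gone. Only after it succeeds should the nexus-data PVC be deleted; the nexus-bak PVC must be left in place.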
        
  4. (ncn-mw#) Follow the Redeploying a Chart procedure with the following specifications:

    • Name of chart to be redeployed: cray-nexus

    • Base name of manifest: nexus

    • Chart files are located in the ./helm directory.

    • When reaching the step to update customizations, no edits need to be made to the customizations file.

    • When running the loftsman ship command, use the following command:

      loftsman ship --charts-path ./helm --manifest-path "${CUSTOMIZED_CHART_FILE}"
      
    • When reaching the step to validate that the redeploy was successful, perform the following step:

      Only follow this step as part of the previously linked chart redeploy procedure.

      Wait for the resources to start.

      watch "kubectl get pods -n nexus -l app=nexus"
      

      Example output:

      NAME                     READY   STATUS    RESTARTS   AGE
      nexus-7f79cd64c8-7j8tb   2/2     Running   0          30m
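
The readiness check at the end of step 4 can be approximated in a script by parsing the same columns shown in the example output. This is a sketch under the assumption that the default kubectl column layout (NAME, READY, STATUS, ...) is in use; all_pods_ready is a hypothetical helper name.

```shell
# Hypothetical helper: read `kubectl get pods --no-headers` lines on stdin.
# Succeeds only if at least one pod is listed and every pod is Running
# with all of its containers ready (for example, READY "2/2").
all_pods_ready() {
    awk '
        {
            seen++
            split($2, ready, "/")    # READY column: ready containers / total
            if (ready[1] != ready[2] || $3 != "Running") bad = 1
        }
        END { exit ((seen > 0 && !bad) ? 0 : 1) }
    '
}
```

For example, kubectl get pods -n nexus -l app=nexus --no-headers | all_pods_ready && echo "Nexus pods are ready" reports success only once the pod shows 2/2 and Running.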
      
  5. (ncn-mw#) Restore the critical data.

    See Nexus Export and Restore.