Modifying a Tenant


This page describes how to modify an existing tenant. Only the following modifications are supported:

  • Updating the list of component names (xnames) assigned to this tenant.
  • Adding/deleting childNamespaces.

Modify the existing TAPMS CR

An example of a change to add a component name (xname) to a tenant:

kind: Tenant
metadata:
  name: vcluster-blue
spec:
  childnamespaces:
    - user
    - slurm
  tenantname: vcluster-blue
  tenantresources:
    - type: compute
      hsmgrouplabel: blue
      enforceexclusivehsmgroups: true
      xnames:
        - x0c3s5b0n0
        - x0c3s6b0n0
        - x0c3s7b0n0

Apply the modified TAPMS CR

  1. (ncn-mw#) Apply the modified tenant CR. When a tenant CR is applied, tapms determines what has changed in the tenant and reconciles any changes to childNamespaces and xnames.

    kubectl apply -n tenants -f <tenant.yaml>

    Example output: configured
  2. (ncn-mw#) It can take up to a minute for tapms to reconcile the change. The following command can be used to monitor the status of the tenant:

    kubectl get tenant -n tenants vcluster-blue -o yaml

    Example output:

    kind: Tenant
    metadata:
      annotations: |
      creationTimestamp: "2022-08-23T18:37:25Z"
      generation: 3
      name: vcluster-blue
      namespace: tenants
      resourceVersion: "3157072"
      uid: 074b6db1-f504-4e9c-8245-259e9b22d2e6
    spec:
      childnamespaces:
      - user
      - slurm
      state: Deployed
      tenantname: vcluster-blue
      tenantresources:
      - enforceexclusivehsmgroups: true
        hsmgrouplabel: blue
        type: compute
        xnames:
        - x0c3s5b0n0
        - x0c3s6b0n0
        - x0c3s7b0n0
    status:
      childnamespaces:
      - vcluster-blue-user
      - vcluster-blue-slurm
      tenantresources:
      - enforceexclusivehsmgroups: true
        hsmgrouplabel: blue
        type: compute
        xnames:
        - x0c3s5b0n0
        - x0c3s6b0n0
        - x0c3s7b0n0
      uuid: 074b6db1-f504-4e9c-8245-259e9b22d2e6
  3. (ncn-mw#) The cray command can now be used to display changes to the HSM group:

    cray hsm groups describe blue --format toml

    Example output:

    label = "blue"
    description = ""
    exclusiveGroup = "tapms-exclusive-group-label"
    tags = [ "vcluster-blue",]
    ids = [ "x0c3s5b0n0", "x0c3s6b0n0", "x0c3s7b0n0"]
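The monitoring step above can be scripted rather than rerun by hand. The following is a minimal sketch, not part of tapms: the function name `wait_for_tenant_state` is hypothetical, and it assumes the tenant state is exposed at `.spec.state` in the tenant resource. The defaults of 12 attempts at 5 seconds apart roughly match the one-minute reconciliation window mentioned above.

```shell
# Hypothetical helper: poll a tenant until it reports the wanted state.
# Assumption: the state is available at .spec.state of the Tenant resource.
wait_for_tenant_state() {
    tenant="$1"
    want="${2:-Deployed}"
    tries="${3:-12}"   # number of polls
    delay="${4:-5}"    # seconds between polls
    i=0
    while [ "$i" -lt "$tries" ]; do
        state=$(kubectl get tenant -n tenants "$tenant" -o jsonpath='{.spec.state}' 2>/dev/null) || state=""
        if [ "$state" = "$want" ]; then
            echo "$tenant is $state"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "timed out waiting for $tenant to reach $want" >&2
    return 1
}
```

For example, `wait_for_tenant_state vcluster-blue Deployed` returns zero once the tenant reports `Deployed`, and non-zero if it does not do so within about a minute.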

Modify the Slurm operator CR

To make changes to a Slurm tenant deployment, first update the Slurm custom resource file. The Slurm operator will attempt to reconcile the following changes:

  • Changing the munge key length.
  • Changing the slurmctld PVC.
  • Changing the Percona XtraDB, slurmctld, or slurmdbd deployments.

Apply the Slurm operator CR

(ncn-mw#) To update the Slurm custom resource, apply the changed file:

kubectl apply -f <cluster>.yaml

Once the custom resource has been updated, the Slurm operator will attempt to update the relevant Kubernetes resources to reflect the changes.
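Before applying, it can help to preview what would change. The sketch below is one possible approach and is not part of the Slurm operator: the function name `preview_and_apply` is hypothetical. It relies on `kubectl diff`, which exits with 0 when there are no differences, 1 when differences exist, and a higher value on error.

```shell
# Hypothetical helper: show the pending changes, then apply only if there are any.
preview_and_apply() {
    file="$1"
    rc=0
    kubectl diff -f "$file" || rc=$?
    case "$rc" in
        0)
            echo "no changes to apply"
            ;;
        1)
            # Differences were printed above; apply the file.
            kubectl apply -f "$file"
            ;;
        *)
            echo "kubectl diff failed with exit code $rc" >&2
            return "$rc"
            ;;
    esac
}
```

It would be called with the edited custom resource file, for example `preview_and_apply <cluster>.yaml`.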

Modify the Slurm configuration

If the list of nodes assigned to a tenant changes, the Slurm configuration must be updated by rerunning the Kubernetes job that generates it. This procedure can also be used to change Slurm configuration settings, or add new Slurm configuration files.

  1. (ncn-mw#) Get the current configuration.

    kubectl get configmap -n slurm-operator <cluster>-config-templates -o yaml >slurm-config-templates.yaml
  2. (ncn-mw#) Make desired edits.

    • To edit an existing configuration file (for example, slurm.conf):

      yq r slurm-config-templates.yaml 'data."slurm.conf"' >slurm.conf
      # Edit slurm.conf
      yq w -i slurm-config-templates.yaml 'data."slurm.conf"' "$(cat slurm.conf)"
    • To add a new configuration file (for example, topology.conf):

      yq w -i slurm-config-templates.yaml 'data."topology.conf"' "$(cat topology.conf)"
  3. (ncn-mw#) Apply changes to the configuration templates.

    kubectl apply -f slurm-config-templates.yaml
  4. (ncn-mw#) Rerun the Slurm configuration job.

    kubectl get job -n slurm-operator <cluster>-slurm-config -o yaml >slurm-config.yaml
    yq d -i slurm-config.yaml spec.template.metadata
    yq d -i slurm-config.yaml spec.selector
    kubectl delete -f slurm-config.yaml
    kubectl create -f slurm-config.yaml
  5. (ncn-mw#) Once the job has completed, reconfigure Slurm.

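    # Assumes the slurmctld pod is the first pod listed in <namespace>; add a label selector if it is not.
    SLURMCTLD_POD=$(kubectl get pod -n <namespace> -o jsonpath='{.items[0].metadata.name}')
    kubectl exec -n <namespace> "${SLURMCTLD_POD}" -c slurmctld -- scontrol reconfigure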
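The last step depends on the configuration job having finished. One way to block until then is `kubectl wait`, sketched below. The function name `wait_for_slurm_config_job` and the default timeout of 300 seconds are assumptions for illustration; the namespace and job name follow the commands used in the procedure above.

```shell
# Hypothetical helper: block until the Slurm configuration job completes.
# The 300s default timeout is an arbitrary choice, not a documented value.
wait_for_slurm_config_job() {
    cluster="$1"
    timeout="${2:-300s}"
    kubectl wait -n slurm-operator --for=condition=complete \
        --timeout="$timeout" "job/${cluster}-slurm-config"
}
```

`kubectl wait` returns non-zero if the timeout expires first, so the `scontrol reconfigure` step can be chained after it with `&&`.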

Update component records in BOS

Any nodes that are moving from one tenant to another should have their records cleared in BOS. Clearing the desired state has the secondary effect of causing BOS to shut down the node. This prevents a tenant from using a node that is booted with another tenant's boot artifacts or configuration.

This procedure requires the Cray CLI to be configured on the node where it is being performed. See Configure the Cray CLI.

  1. (ncn-mw#) Clear the desired state.

    Repeat this step for each component that is moving from one tenant to another.

    cray bos v2 components update <xname> --enabled true --staged-state-session "" --staged-state-configuration "" \
        --staged-state-boot-artifacts-initrd "" --staged-state-boot-artifacts-kernel-parameters "" --staged-state-boot-artifacts-kernel "" \
        --desired-state-bss-token "" --desired-state-configuration "" --desired-state-boot-artifacts-initrd "" \
        --desired-state-boot-artifacts-kernel-parameters "" --desired-state-boot-artifacts-kernel ""
  2. (ncn-mw#) Wait until the component reaches a stable state.

    Because the previous step cleared the desired state, the stable state indicates that the component is powered off.

    cray bos v2 components describe <xname> --format json | jq .status.status
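When several components are moving between tenants, the wait step can be looped. The following sketch is illustrative only: the function name `wait_until_stable` is hypothetical, and it assumes the status string reported at `.status.status` is exactly `stable`. It uses `jq -r` so the value is compared without surrounding quotes.

```shell
# Hypothetical helper: wait for each listed component to report a stable status.
# Usage: wait_until_stable <tries> <delay-seconds> <xname>...
# Assumption: BOS reports the literal string "stable" at .status.status.
wait_until_stable() {
    tries="$1"
    delay="$2"
    shift 2
    for xname in "$@"; do
        i=0
        status=""
        while [ "$i" -lt "$tries" ]; do
            status=$(cray bos v2 components describe "$xname" --format json | jq -r '.status.status')
            [ "$status" = "stable" ] && break
            i=$((i + 1))
            sleep "$delay"
        done
        if [ "$status" != "stable" ]; then
            echo "$xname did not reach stable (last status: $status)" >&2
            return 1
        fi
        echo "$xname is stable"
    done
}
```

For example, `wait_until_stable 30 10 x0c3s5b0n0 x0c3s6b0n0` checks each component up to 30 times, 10 seconds apart, and stops at the first component that does not become stable.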