Prepare Site Init

These procedures guide administrators through setting up the site-init directory, which contains important customizations for various products.

Note: Two media are available to bootstrap the PIT node: the RemoteISO or a bootable USB device. This procedure applies to both. The only difference is that the RemoteISO method runs these commands on the PIT node (command prompt pit#), while the USB method can be performed on any Linux system (command prompt linux#). The examples below use the linux# prompt, even though it would normally be pit# for the RemoteISO method.

Topics

  1. Background
  2. Create and Initialize Site Init Directory
  3. Create Baseline System Customizations
  4. Generate Sealed Secrets
  5. Version Control Site Init Files
    1. Push to a Remote Repository
  6. Patch cloud-init with the CA
  7. Customer-Specific Customizations

Details

1. Background

The shasta-cfg directory in CSM includes relatively static, installation-centric artifacts such as:

  • Cluster-wide network configuration settings required by Helm Charts deployed by product stream Loftsman Manifests
  • Sealed Secrets
  • Sealed Secret Generate Blocks – a form of plain-text input that renders to a Sealed Secret
  • Helm Chart value overrides that are merged into Loftsman Manifests by product stream installers

2. Create and Initialize Site Init Directory

  1. Create directory /mnt/pitdata/prep/site-init.

    linux# mkdir -pv /mnt/pitdata/prep/site-init
    linux# cd /mnt/pitdata/prep/site-init
    
  2. Set CSM_RELEASE and SYSTEM_NAME variables, if not already set.

    linux# CSM_RELEASE=csm-x.y.z
    linux# SYSTEM_NAME=eniac
    
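A minimal sketch of a guard that could be run before continuing, assuming a POSIX shell. The function name require_vars is hypothetical and not part of CSM. It reports which of the named variables is unset or empty, so that later paths such as /mnt/pitdata/${CSM_RELEASE} do not silently expand to the wrong location.

```shell
# Hypothetical helper (not shipped with CSM): return non-zero if any
# of the named shell variables is unset or empty.
require_vars() {
    missing=0
    for name in "$@"; do
        # Indirectly read the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "ERROR: $name is not set" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Example use:

    linux# require_vars CSM_RELEASE SYSTEM_NAME
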
  3. Initialize /mnt/pitdata/prep/site-init from CSM.

    linux# /mnt/pitdata/${CSM_RELEASE}/shasta-cfg/meta/init.sh /mnt/pitdata/prep/site-init
    

    IMPORTANT The output of this command states that customizations.yaml should be reviewed and updated before running the secrets-reencrypt.sh and secrets-seed-customizations.sh scripts. These two scripts will be run later in this document; do not run them at this time.

  4. The yq tool used in the following procedures is available under /mnt/pitdata/prep/site-init/utils/bin once the SHASTA-CFG repo has been cloned. Create an alias for it:

    linux# alias yq="/mnt/pitdata/${CSM_RELEASE}/shasta-cfg/utils/bin/$(uname | awk '{print tolower($0)}')/yq"
    

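Before relying on the alias, it may help to confirm that a yq binary actually exists at the expected location. The following is a sketch only: yq_path and check_yq are hypothetical helper names, and the path layout is copied from the alias above.

```shell
# Hypothetical helper: print the path of the yq binary bundled with
# the CSM release, using the same layout as the alias above.
yq_path() {
    os="$(uname | awk '{print tolower($0)}')"
    printf '%s\n' "/mnt/pitdata/${CSM_RELEASE}/shasta-cfg/utils/bin/${os}/yq"
}

# Hypothetical helper: succeed only if that path is an executable file.
check_yq() {
    path="$(yq_path)"
    if [ -x "$path" ]; then
        echo "yq found at $path"
    else
        echo "ERROR: no executable yq at $path" >&2
        return 1
    fi
}
```

Example use:

    linux# check_yq
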
3. Create Baseline System Customizations

The following steps update /mnt/pitdata/prep/site-init/customizations.yaml with system-specific customizations.

  1. Merge the system-specific settings generated by CSI into customizations.yaml.

    linux# yq merge -xP -i /mnt/pitdata/prep/site-init/customizations.yaml <(yq prefix -P "/mnt/pitdata/prep/${SYSTEM_NAME}/customizations.yaml" spec)
    
  2. Set the cluster name.

    linux# yq write -i /mnt/pitdata/prep/site-init/customizations.yaml spec.wlm.cluster_name "$SYSTEM_NAME"
    
  3. Make a backup copy of /mnt/pitdata/prep/site-init/customizations.yaml.

    linux# cp -v /mnt/pitdata/prep/site-init/customizations.yaml /mnt/pitdata/prep/site-init/customizations.yaml.prepassword
    
  4. Review the configuration to generate these sealed secrets in customizations.yaml in the site-init directory.

    • spec.kubernetes.sealed_secrets.cray_reds_credentials
    • spec.kubernetes.sealed_secrets.cray_meds_credentials
    • spec.kubernetes.sealed_secrets.cray_hms_rts_credentials

    The cray_reds_credentials are used by the HMS Discovery CronJob and the River Endpoint Discovery Service (REDS) for River components. This sealed secret contains the following:

    • Default Redfish root user credentials for air-cooled Node and Router BMCs.
    • Default SNMP credentials configured on leaf-bmc switches. This needs to match the SNMP credentials currently configured on leaf-bmc switches or the credentials will be configured later in the install. The SNMP username, authentication password, and the privacy password need to be provided.

    The cray_meds_credentials are used by the Mountain Endpoint Discovery Service (MEDS) for the liquid-cooled components in an Olympus (Mountain) cabinet.

    • The root user password needs to match what is currently configured on the CECs for the liquid-cooled cabinets in the system.

    The cray_hms_rts_credentials are used by the Redfish Translation Service (RTS) for any hardware components which are not managed by Redfish, such as a ServerTech PDU in a River Cabinet.

    • The ServerTech PDU credentials need to match the credentials for the admn user that is currently configured on the PDUs.
    • The RTS root user credentials are unique to RTS and used only within the service mesh; these can be set as desired.

    Edit customizations.yaml to replace the Username and Password references in the file so that the values match the existing settings of your system hardware components. To examine credentials from prior installs, see the Decrypt Sealed Secrets for Review section of Manage Sealed Secrets.

  5. Review the changes.

    linux# diff /mnt/pitdata/prep/site-init/customizations.yaml /mnt/pitdata/prep/site-init/customizations.yaml.prepassword
    
  6. Validate that REDS/MEDS/RTS sealed secrets contain valid JSON using jq.

    1. Validate REDS credentials.

      These credentials are used by the REDS and HMS discovery services, targeting River Redfish BMC endpoints and management switches.

      linux# yq read /mnt/pitdata/prep/site-init/customizations.yaml 'spec.kubernetes.sealed_secrets.cray_reds_credentials.generate.data[0].args.value' | jq
      linux# yq read /mnt/pitdata/prep/site-init/customizations.yaml 'spec.kubernetes.sealed_secrets.cray_reds_credentials.generate.data[1].args.value' | jq
      

      NOTE: For vault_redfish_defaults, the only entry used is '{"Cray": {"Username": "root", "Password": "XXXX"}}'. Make sure it is specified as shown, with the Cray key. This key is not used in any of the other credential specifications. Make sure that the Username and Password entries are correct.

    2. Validate MEDS credentials.

      These credentials are used by the MEDS service, targeting Redfish BMC endpoints. Make sure that Username and Password entries are correct.

      linux# yq read /mnt/pitdata/prep/site-init/customizations.yaml 'spec.kubernetes.sealed_secrets.cray_meds_credentials.generate.data[0].args.value' | jq
      
    3. Validate RTS credentials.

      These credentials are used by the Redfish Translation Service, targeting River Redfish BMC endpoints and PDU controllers. Make sure that Username and Password entries are correct.

      linux# yq read /mnt/pitdata/prep/site-init/customizations.yaml 'spec.kubernetes.sealed_secrets.cray_hms_rts_credentials.generate.data[0].args.value' | jq
      linux# yq read /mnt/pitdata/prep/site-init/customizations.yaml 'spec.kubernetes.sealed_secrets.cray_hms_rts_credentials.generate.data[1].args.value' | jq
      
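As a further check on the vault_redfish_defaults entry described in the NOTE above, the following sketch uses jq to confirm that a value parses as JSON and has a Cray key containing both Username and Password. The function name check_redfish_defaults is hypothetical. The JSON value is expected on standard input, for example piped from one of the yq read commands above.

```shell
# Hypothetical helper: read a JSON value on stdin and succeed only if
# it is valid JSON with a "Cray" object holding Username and Password.
check_redfish_defaults() {
    # jq -e sets a non-zero exit status when the result is false or null,
    # and jq itself fails on malformed JSON or a missing "Cray" key.
    jq -e '.Cray | has("Username") and has("Password")' >/dev/null
}
```

Example use, with a literal value in place of the yq read output:

    linux# echo '{"Cray": {"Username": "root", "Password": "XXXX"}}' | check_redfish_defaults && echo OK
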
  7. To customize the PKI Certificate Authority (CA) used by the platform, see Certificate Authority.

    IMPORTANT The CA may not be modified after install.

  8. Federate Keycloak with an upstream LDAP server.

    1. Set environment variables for LDAP and PORT.

      In the example below, the LDAP server has the hostname dcldap2.us.cray.com and is using the port 636.

      linux# export LDAP=dcldap2.us.cray.com
      linux# PORT=636
      
    2. If LDAP requires TLS (recommended), update the cray-keycloak sealed secret value by supplying a base64 encoded Java KeyStore (JKS) that contains the CA certificate that signed the LDAP server’s host key. The password for the JKS file must be password. Administrators may use the keytool command from the openjdk:11-jre-slim container image packaged with CSM to create a JKS file that includes a PEM-encoded CA certificate to verify the LDAP host(s) as follows:

      This step builds an example that will create (or update) certs.jks with the PEM-encoded CA certificate for an LDAP host and then prepares certs.jks.b64, which will be injected into customizations.yaml.

      1. Load the openjdk container image.

        NOTE Requires a properly configured Docker or Podman environment.

        If this step fails with an “overlay is not supported” error, run the following command and then re-run the load-container-image.sh script:

        sed -i 's/skopeo_dest=.*/skopeo_dest="${transport}:${image}"/' /mnt/pitdata/${CSM_RELEASE}/hack/load-container-image.sh

        linux# /mnt/pitdata/${CSM_RELEASE}/hack/load-container-image.sh dtr.dev.cray.com/library/openjdk:11-jre-slim
        
      2. Get the issuer certificate for the LDAP server at port 636. Use openssl s_client to connect and show the certificate chain returned by the LDAP host.

        linux# openssl s_client -showcerts -connect $LDAP:${PORT} </dev/null
        
      3. Either manually extract (i.e., cut/paste) the issuer’s certificate into cacert.pem or try the following commands to create it automatically.

        NOTE The following commands were verified using OpenSSL version 1.1.1d and use the -nameopt RFC2253 option to ensure consistent formatting of distinguished names (DNs). Older versions of OpenSSL may not support -nameopt on the s_client command or may use a different default format, so results may vary. If the following commands are unsuccessful, extract the issuer certificate manually from the output of the openssl s_client example above.

        1. Observe the issuer’s DN.

          linux# openssl s_client -showcerts -nameopt RFC2253 -connect $LDAP:${PORT} </dev/null 2>/dev/null | grep issuer= | sed -e 's/^issuer=//'
          

          Expected output would include a line similar to this:

          emailAddress=dcops@hpe.com,CN=Data Center,OU=HPC/MCS,O=HPE,ST=WI,C=US
          
        2. Extract the issuer’s certificate.

          NOTE The issuer DN is properly escaped as part of the awk pattern below. It must be changed to match the value for emailAddress, CN, OU, etc. for your LDAP. If the value you are using is different, be sure to escape it properly!

          linux# openssl s_client -showcerts -nameopt RFC2253 -connect $LDAP:${PORT} </dev/null 2>/dev/null | \
                    awk '/s:emailAddress=dcops@hpe.com,CN=Data Center,OU=HPC\/MCS,O=HPE,ST=WI,C=US/,/END CERTIFICATE/' | \
                    awk '/BEGIN CERTIFICATE/,/END CERTIFICATE/' > cacert.pem
          
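One way to sidestep the escaping concern is to match the issuer DN as a literal string rather than as an awk regular expression. The following is a sketch of that alternative, not the verified procedure above. extract_issuer_cert is a hypothetical helper that reads openssl s_client -showcerts output on standard input and prints the first certificate whose subject line contains the given DN. It assumes subject lines are prefixed with s: as in the example above.

```shell
# Hypothetical helper: print the first PEM certificate whose subject
# line ("s:<DN>") contains the DN given as $1. Input is read from stdin.
extract_issuer_cert() {
    # Pass the DN through the environment so awk does not interpret
    # backslashes or regex metacharacters in it.
    ISSUER_DN="$1" awk '
        !found && index($0, "s:" ENVIRON["ISSUER_DN"]) { found = 1; next }
        found == 1 && /BEGIN CERTIFICATE/ { in_cert = 1 }
        in_cert { print }
        in_cert && /END CERTIFICATE/ { in_cert = 0; found = 2 }
    '
}
```

Example use, with the DN copied unescaped from the previous step:

    linux# openssl s_client -showcerts -nameopt RFC2253 -connect $LDAP:${PORT} </dev/null 2>/dev/null | extract_issuer_cert 'emailAddress=dcops@hpe.com,CN=Data Center,OU=HPC/MCS,O=HPE,ST=WI,C=US' > cacert.pem
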
      4. Verify issuer’s certificate was properly extracted and saved in cacert.pem.

        linux# cat cacert.pem
        

        Expected output looks like:

        -----BEGIN CERTIFICATE-----
        MIIDvTCCAqWgAwIBAgIUYxrG/PrMcmIzDuJ+U1Gh8hpsU8cwDQYJKoZIhvcNAQEL
        BQAwbjELMAkGA1UEBhMCVVMxCzAJBgNVBAgMAldJMQwwCgYDVQQKDANIUEUxEDAO
        BgNVBAsMB0hQQy9NQ1MxFDASBgNVBAMMC0RhdGEgQ2VudGVyMRwwGgYJKoZIhvcN
        AQkBFg1kY29wc0BocGUuY29tMB4XDTIwMTEyNDIwMzM0MVoXDTMwMTEyMjIwMzM0
        MVowbjELMAkGA1UEBhMCVVMxCzAJBgNVBAgMAldJMQwwCgYDVQQKDANIUEUxEDAO
        BgNVBAsMB0hQQy9NQ1MxFDASBgNVBAMMC0RhdGEgQ2VudGVyMRwwGgYJKoZIhvcN
        AQkBFg1kY29wc0BocGUuY29tMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKC
        AQEAuBIZkKitHHVQHymtaQt4D8ZhG4qNJ0cTsLhODPMtVtBjPZp59e+PWzbc9Rj5
        +wfjLGteK6/fNJsJctWlS/ar4jw/xBIPMk5pg0dnkMT2s7lkSCmyd9Uib7u6y6E8
        yeGoGcb7I+4ZI+E3FQV7zPact6b17xmajNyKrzhBGEjYucYJUL5iTgZ6a7HOZU2O
        aQSXe7ctiHBxe7p7RhHCuKRrqJnxoohakloKwgHHzDLFQzX/5ADp1hdJcduWpaXY
        RMBu6b1mhmwo5vmc+fDnfUpl5/X4i109r9VN7JC7DQ5+JX8u9SHDGLggBWkrhpvl
        bNXMVCnwnSFfb/rnmGO7rdJSpwIDAQABo1MwUTAdBgNVHQ4EFgQUVg3VYExUAdn2
        WE3e8Xc8HONy/+4wHwYDVR0jBBgwFoAUVg3VYExUAdn2WE3e8Xc8HONy/+4wDwYD
        VR0TAQH/BAUwAwEB/zANBgkqhkiG9w0BAQsFAAOCAQEAWLDQLB6rrmK+gwUY+4B7
        0USbQK0JkLWuc0tCfjTxNQTzFb75PeH+GH21QsjUI8VC6QOAAJ4uzIEV85VpOQPp
        qjz+LI/Ej1xXfz5ostZQu9rCMnPtVu7JT0B+NV7HvgqidTfa2M2dw9yUYS2surZO
        8S0Dq3Bi6IEhtGU3T8ZpbAmAp+nNsaJWdUNjD4ECO5rAkyA/Vu+WyMz6F3ZDBmRr
        ipWM1B16vx8rSpQpygY+FNX4e1RqslKhoyuzXfUGzyXux5yhs/ufOaqORCw3rJIx
        v4sTWGsSBLXDsFM3lBgljSAHfmDuKdO+Qv7EqGzCRMpgSciZihnbQoRrPZkOHUxr
        NA==
        -----END CERTIFICATE-----
        
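Beyond viewing the PEM text, the subject, issuer, and expiry date can be decoded with openssl x509, which makes it easier to confirm that the saved certificate is the intended CA. A minimal sketch, with describe_cert as a hypothetical wrapper name:

```shell
# Hypothetical wrapper: print the subject, issuer, and expiry date of
# the PEM certificate file given as $1.
describe_cert() {
    openssl x509 -in "$1" -noout -subject -issuer -enddate -nameopt RFC2253
}
```

Example use:

    linux# describe_cert cacert.pem
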
      5. Create certs.jks.

        NOTE The alias used in this command for cray-data-center-ca should be changed to match your LDAP.

        linux# podman run --rm -v "$(pwd):/data" dtr.dev.cray.com/library/openjdk:11-jre-slim keytool -importcert \
        -trustcacerts -file /data/cacert.pem -alias cray-data-center-ca -keystore /data/certs.jks \
        -storepass password -noprompt
        
      6. Create certs.jks.b64 by base-64 encoding certs.jks.

        linux# base64 certs.jks > certs.jks.b64
        
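To confirm that the encoded file decodes back to the original keystore, a round-trip comparison can be used. This is a sketch assuming GNU coreutils base64 (the -d option); verify_b64_roundtrip is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if decoding the base64 file ($2)
# yields exactly the bytes of the original file ($1).
verify_b64_roundtrip() {
    base64 -d "$2" | cmp -s - "$1"
}
```

Example use:

    linux# verify_b64_roundtrip certs.jks certs.jks.b64 && echo OK
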
      7. Inject and encrypt certs.jks.b64 into customizations.yaml.

        linux# cat <<EOF | yq w - 'data."certs.jks"' "$(<certs.jks.b64)" | \
        yq r -j - | /mnt/pitdata/prep/site-init/utils/secrets-encrypt.sh | \
        yq w -f - -i /mnt/pitdata/prep/site-init/customizations.yaml 'spec.kubernetes.sealed_secrets.cray-keycloak'
        {
          "kind": "Secret",
          "apiVersion": "v1",
          "metadata": {
            "name": "keycloak-certs",
            "namespace": "services",
            "creationTimestamp": null
          },
          "data": {}
        }
        EOF
        
    3. Update the keycloak_users_localize sealed secret with the appropriate value for ldap_connection_url.

      1. Set ldap_connection_url in customizations.yaml.

        For example:

        linux# yq write -i /mnt/pitdata/prep/site-init/customizations.yaml \
        'spec.kubernetes.sealed_secrets.keycloak_users_localize.generate.data.(args.name==ldap_connection_url).args.value' "ldaps://$LDAP"
        
      2. Review the keycloak_users_localize sealed secret.

        linux# yq read /mnt/pitdata/prep/site-init/customizations.yaml spec.kubernetes.sealed_secrets.keycloak_users_localize
        

        Expected output is similar to:

        generate:
          name: keycloak-users-localize
          data:
          - type: static
            args:
              name: ldap_connection_url
              value: ldaps://dcldap2.us.cray.com
      
    4. Configure the ldapSearchBase and localRoleAssignments settings for the cray-keycloak-users-localize chart in customizations.yaml.

      There may be one or more groups in LDAP for administrators and one or more for users. Each admin group needs to be assigned to role admin and set to both shasta and cray clients in Keycloak. Each user group needs to be assigned to role user and set to both shasta and cray clients in Keycloak.

      1. Set ldapSearchBase in customizations.yaml.

        This example sets ldapSearchBase to dc=dcldap,dc=dit

        linux# yq write -i /mnt/pitdata/prep/site-init/customizations.yaml spec.kubernetes.services.cray-keycloak-users-localize.ldapSearchBase 'dc=dcldap,dc=dit'
        
      2. Set localRoleAssignments in customizations.yaml.

        This example sets localRoleAssignments for the LDAP groups employee, craydev, and shasta_admins to be the admin role and the LDAP group shasta_users to be the user role.

        linux# yq write -s - -i /mnt/pitdata/prep/site-init/customizations.yaml <<EOF
        - command: update
          path: spec.kubernetes.services.cray-keycloak-users-localize.localRoleAssignments
          value:
          - {"group": "employee", "role": "admin", "client": "shasta"}
          - {"group": "employee", "role": "admin", "client": "cray"}
          - {"group": "craydev", "role": "admin", "client": "shasta"}
          - {"group": "craydev", "role": "admin", "client": "cray"}
          - {"group": "shasta_admins", "role": "admin", "client": "shasta"}
          - {"group": "shasta_admins", "role": "admin", "client": "cray"}
          - {"group": "shasta_users", "role": "user", "client": "shasta"}
          - {"group": "shasta_users", "role": "user", "client": "cray"}
        EOF
        
      3. Review the cray-keycloak-users-localize values.

        linux# yq read /mnt/pitdata/prep/site-init/customizations.yaml spec.kubernetes.services.cray-keycloak-users-localize
        

        Expected output looks similar to:

        sealedSecrets:
            - '{{ kubernetes.sealed_secrets.keycloak_users_localize | toYaml }}'
        localRoleAssignments:
            - {"group": "employee", "role": "admin", "client": "shasta"}
            - {"group": "employee", "role": "admin", "client": "cray"}
            - {"group": "craydev", "role": "admin", "client": "shasta"}
            - {"group": "craydev", "role": "admin", "client": "cray"}
            - {"group": "shasta_admins", "role": "admin", "client": "shasta"}
            - {"group": "shasta_admins", "role": "admin", "client": "cray"}
            - {"group": "shasta_users", "role": "user", "client": "shasta"}
            - {"group": "shasta_users", "role": "user", "client": "cray"}
        ldapSearchBase: dc=dcldap,dc=dit
        
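Because every group needs one entry per Keycloak client, the localRoleAssignments list grows quickly. The following sketch generates the entries for one role from a list of LDAP group names. role_assignments is a hypothetical helper, and the clients shasta and cray are taken from the description above. The output would still need to be placed under the value key of the yq write script by hand.

```shell
# Hypothetical helper: print one localRoleAssignments entry per group
# and per Keycloak client. $1 is the role; the rest are LDAP groups.
role_assignments() {
    role="$1"
    shift
    for group in "$@"; do
        for client in shasta cray; do
            printf '  - {"group": "%s", "role": "%s", "client": "%s"}\n' \
                "$group" "$role" "$client"
        done
    done
}
```

Example use:

    linux# role_assignments admin employee craydev shasta_admins
    linux# role_assignments user shasta_users
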
  9. If the system needs to resolve outside hostnames, forwarding must be configured in the cray-dns-unbound service. For example, if the upstream LDAP server in step 8 above is specified by hostname rather than by IP address, that hostname must be resolvable.

    1. Set the forwardZones for the cray-dns-unbound service.

      linux# yq write -s - -i /mnt/pitdata/prep/site-init/customizations.yaml <<EOF
      - command: update
        path: spec.kubernetes.services.cray-dns-unbound
        value:
          forwardZones:
          - name: "."
            forwardIps:
            - "{{ network.netstaticips.system_to_site_lookups }}"
      EOF
      
    2. On success, review the cray-dns-unbound values.

      linux# yq read /mnt/pitdata/prep/site-init/customizations.yaml spec.kubernetes.services.cray-dns-unbound
      

      Expected output looks similar to:

      forwardZones:
      - name: "."
        forwardIps:
        - "{{ network.netstaticips.system_to_site_lookups }}"
      

4. Generate Sealed Secrets

Secrets are stored in customizations.yaml as SealedSecret resources (i.e., encrypted secrets) which are deployed by specific charts and decrypted by the Sealed Secrets operator. But first, those secrets must be seeded, generated, and encrypted.

  1. Load the zeromq container image required by Sealed Secret Generators.

    NOTE Requires a properly configured Docker or Podman environment.

    If this step fails with an “overlay is not supported” error, then run the following command and then re-run the load-container-image.sh script:

    sed -i 's/skopeo_dest=.*/skopeo_dest="${transport}:${image}"/' /mnt/pitdata/${CSM_RELEASE}/hack/load-container-image.sh

    linux# /mnt/pitdata/${CSM_RELEASE}/hack/load-container-image.sh dtr.dev.cray.com/zeromq/zeromq:v4.0.5
    

    IMPORTANT: CSM 1.0.11 shipped a version of shasta-cfg that requires a different image tag for the zeromq image. If installing this release, see Incorrectly Tagged zeromq Image for instructions on re-tagging the image before proceeding.

  2. Re-encrypt existing secrets.

    linux# /mnt/pitdata/prep/site-init/utils/secrets-reencrypt.sh /mnt/pitdata/prep/site-init/customizations.yaml \
    /mnt/pitdata/prep/site-init/certs/sealed_secrets.key /mnt/pitdata/prep/site-init/certs/sealed_secrets.crt
    

    It is not an error if this script gives no output.

  3. Generate secrets.

    linux# /mnt/pitdata/prep/site-init/utils/secrets-seed-customizations.sh \
    /mnt/pitdata/prep/site-init/customizations.yaml
    

    Expected output looks similar to:

    Creating Sealed Secret keycloak-certs
    Generating type static_b64...
    Creating Sealed Secret keycloak-master-admin-auth
    Generating type static...
    Generating type static...
    Generating type randstr...
    Generating type static...
    Creating Sealed Secret cray_reds_credentials
    Generating type static...
    Generating type static...
    Creating Sealed Secret cray_meds_credentials
    Generating type static...
    Creating Sealed Secret cray_hms_rts_credentials
    Generating type static...
    Generating type static...
    Creating Sealed Secret vcs-user-credentials
    Generating type randstr...
    Generating type static...
    Creating Sealed Secret generated-platform-ca-1
    Generating type platform_ca...
    Creating Sealed Secret pals-config
    Generating type zmq_curve...
    Generating type zmq_curve...
    Creating Sealed Secret munge-secret
    Generating type randstr...
    Creating Sealed Secret slurmdb-secret
    Generating type static...
    Generating type static...
    Generating type randstr...
    Generating type randstr...
    Creating Sealed Secret keycloak-users-localize
    Generating type static...
    

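If the output of the script is captured to a file, the names of the secrets it reported can be listed and compared with what is expected for the system. A small sketch, assuming the output format shown above; the function name list_created_secrets and the log file name seed.log are hypothetical.

```shell
# Hypothetical helper: read script output on stdin and print only the
# names that follow "Creating Sealed Secret".
list_created_secrets() {
    sed -n 's/^[[:space:]]*Creating Sealed Secret //p'
}
```

Example use, after saving the script output to seed.log:

    linux# list_created_secrets < seed.log
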
5. Version Control Site Init Files

Set up /mnt/pitdata/prep/site-init as a Git repository in order to manage the baseline configuration during initial system installation.

  1. Initialize /mnt/pitdata/prep/site-init as a Git repository.

    linux# cd /mnt/pitdata/prep/site-init
    linux# git init .
    
  2. (Optional) WARNING If production system or operational security is a concern, do NOT store the sealed secret private key in Git; instead, store the sealed secret key outside of Git in a secure offline system. To ensure these sensitive keys are not accidentally committed, configure .gitignore to ignore files under the certs directory:

    linux# echo "certs/" >> .gitignore
    
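A sketch of a slightly more defensive variant of this step, assuming it is run from the top of the site-init repository. ignore_certs_dir is a hypothetical helper that avoids adding a duplicate entry if the step is repeated. The result can then be confirmed with git check-ignore.

```shell
# Hypothetical helper: append "certs/" to .gitignore in the current
# directory only if that exact line is not already present.
ignore_certs_dir() {
    grep -qxF 'certs/' .gitignore 2>/dev/null || echo 'certs/' >> .gitignore
}
```

Example use:

    linux# ignore_certs_dir
    linux# git check-ignore -v certs/sealed_secrets.key
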
  3. Stage site-init files to be committed.

    linux# git add -A
    
  4. Review what will be committed.

    linux# git status
    
  5. Commit all the above changes as the baseline configuration.

    linux# git commit -m "Baseline configuration for $(/mnt/pitdata/${CSM_RELEASE}/lib/version.sh)"
    

5.1 Push to a Remote Repository

It is strongly recommended that the site-init repository be maintained off-cluster. Add a remote repository and push the baseline configuration on the master branch to a corresponding remote branch.

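The recommendation above can be sketched with standard Git commands. push_baseline is a hypothetical helper name, and the remote URL is site-specific and must be supplied by the administrator. It assumes the current directory is the site-init repository and that the baseline commit is on the master branch.

```shell
# Hypothetical helper: add a remote and push the master branch to it.
# $1: URL of the remote repository (site-specific)
# $2: remote name, defaults to "origin"
push_baseline() {
    remote="${2:-origin}"
    git remote add "$remote" "$1" &&
    git push -u "$remote" master
}
```

Example use, where the URL is a placeholder:

    linux# cd /mnt/pitdata/prep/site-init
    linux# push_baseline <remote-repository-url>
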
6. Patch cloud-init with the CA

NOTE Only execute this procedure if you booted from the remote ISO using the procedure Bootstrap LiveCD Remote ISO. Skip this if using a USB LiveCD; these steps are done elsewhere in that procedure.

Using csi on a generated site-init directory:

  1. Patch the CA certificate from the shasta-cfg.

    pit# csi patch ca \
             --cloud-init-seed-file /var/www/ephemeral/configs/data.json \
             --customizations-file /var/www/ephemeral/prep/site-init/customizations.yaml \
             --sealed-secret-key-file /var/www/ephemeral/prep/site-init/certs/sealed_secrets.key
    
  2. Ensure that it picks up the new metadata.

    pit# systemctl restart basecamp
    
  3. Unmount the shim from earlier, if one was used.

    pit# umount -v /mnt/pitdata
    

7. Customer-Specific Customizations

Customer-specific customizations are any changes on top of the baseline configuration to satisfy customer-specific requirements. It is recommended that customer-specific customizations be tracked on branches separate from the mainline in order to make them easier to manage.

Apply any customer-specific customizations by merging the corresponding branches into the master branch of /mnt/pitdata/prep/site-init.

When considering merges, and especially when resolving conflicts, carefully examine differences to ensure all changes are relevant. For example, when applying a customer-specific customization used in a prior version, be sure the change still makes sense. It is common for options to change as new features are introduced and bugs are fixed.
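
One way to examine a merge before it becomes part of master is to stage it without committing. The following is a sketch using standard Git options; preview_merge is a hypothetical helper name, and the branch name passed to it is site-specific. It assumes the current directory is the site-init repository.

```shell
# Hypothetical helper: merge the branch named in $1 into master but
# stop before committing, then summarize the staged changes.
preview_merge() {
    git checkout -q master &&
    git merge --no-ff --no-commit "$1" &&
    git diff --cached --stat
}
```

After reviewing the staged changes (for example with git diff --cached), either run git commit to complete the merge or git merge --abort to discard it.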