Non-compute node (NCN) personalization applies post-boot configuration to the
HPE Cray EX management nodes. Several HPE Cray EX product environments outside
of CSM require NCN personalization to function. To configure these products on NCNs, consult the manual for each
product by referring to the
HPE Cray EX System Software Getting Started Guide (S-8000) 22.07
on the HPE Customer Support Center.
This procedure defines the NCN personalization process for the CSM product using the Configuration Framework Service (CFS).
During a fresh install, carry out these procedures in order. Later, individual procedures may be re-run as needed.
This procedure should be run during CSM installation and any later time when the SSH keys need to be changed per site requirements.
Passwordless SSH provides an easy way to open interactive SSH sessions, without a password, from and between CSM product environments (management nodes) to downstream managed product environments (COS, UAN, etc.). It does this without requiring each downstream environment to create and apply individual changes to NCNs, and it serves as the primary way to manage passwordless SSH configuration between management nodes. Passwordless SSH from downstream nodes into CSM management nodes is not intended or supported.
Passwordless SSH keypairs for Cray System Management (CSM) are created
automatically and maintained with a Kubernetes deployment and staged into
Kubernetes secrets (csm-private-key) and ConfigMaps (csm-public-key) in the services
namespace. Administrators can use these provided keys, provide their
own keys, or use their own solution for authentication.
The management of keys on NCNs is achieved by the trust-csm-ssh-keys and passwordless-ssh
Ansible roles in the CSM configuration management repository.
The SSH keypair is applied to management nodes using NCN personalization.
NOTE: CFS itself does not use the CSM-provided (or user-supplied) SSH keys to make connections between nodes. CFS will continue to function if passwordless SSH is disabled between CSM and other product environments.
The default CSM Ansible plays are already configured to enable passwordless SSH.
No further action is necessary before proceeding to
Configure the root password and SSH keys in Vault.
Administrators may elect to replace the CSM-provided keys with their own custom keys.
Set variables to the locations of the public and private SSH key files.
Replace the values in the examples below with the paths to the desired key files on the system.
ncn-mw# PUBLIC_KEY_FILE=/path/to/id_rsa-csm.pub
ncn-mw# PRIVATE_KEY_FILE=/path/to/id_rsa-csm
Provide the custom keys by script or manually.
There are two options for providing the keys.
Provide custom SSH keys by script.
The replace_ssh_keys.sh script can be used to replace the keys from files.
The docs-csm RPM must be installed in order to use this script. See Check for Latest Documentation.
ncn-mw# /usr/share/doc/csm/scripts/operations/configuration/replace_ssh_keys.sh \
--public-key-file "${PUBLIC_KEY_FILE}" --private-key-file "${PRIVATE_KEY_FILE}"
Manually provide custom SSH keys
The keys stored in Kubernetes can be updated directly.
Replace the private key half:
ncn-mw# KEY64=$(cat "${PRIVATE_KEY_FILE}" | base64) &&
kubectl get secret -n services csm-private-key -o json | \
jq --arg value "$KEY64" '.data["value"]=$value' | kubectl apply -f - &&
unset KEY64
Replace the public key half:
ncn-mw# kubectl delete configmap -n services csm-public-key &&
cat "${PUBLIC_KEY_FILE}" | base64 > ./value &&
kubectl create configmap --from-file value csm-public-key --namespace services &&
rm ./value
Passwordless SSH with the provided keys will be set up once NCN personalization runs on the NCNs.
NOTE: This keypair may be the same keypair used for the NCN root user, but it
is not required to be the same. Either option is valid.
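After either option, it may be worth confirming that the objects in Kubernetes hold the intended keys. The following is a minimal sketch, not part of CSM. It assumes that both objects keep the key under the value entry and that the ConfigMap holds the public key base64-encoded, as in the manual commands above. The function name verify_stored_keys is hypothetical.

```shell
# Hypothetical helper (not shipped with CSM): compare the keys stored in
# Kubernetes with the local key files, without printing any key material.
# Assumes both objects keep the key under the "value" entry.
verify_stored_keys() {
    local public_key_file="$1" private_key_file="$2"

    # Secret data is returned base64-encoded; decode it once to get the key.
    if ! kubectl get secret -n services csm-private-key \
            -o jsonpath='{.data.value}' | base64 -d |
            cmp -s - "${private_key_file}"; then
        echo "csm-private-key does not match ${private_key_file}" >&2
        return 1
    fi

    # The manual procedure stores the public key base64-encoded as well.
    if ! kubectl get configmap -n services csm-public-key \
            -o jsonpath='{.data.value}' | base64 -d |
            cmp -s - "${public_key_file}"; then
        echo "csm-public-key does not match ${public_key_file}" >&2
        return 1
    fi

    echo "Stored keys match the key files"
}
```

It would be called as verify_stored_keys "${PUBLIC_KEY_FILE}" "${PRIVATE_KEY_FILE}". Using cmp -s keeps the private key out of the terminal and out of any captured output.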
Proceed to Configure the root password and SSH keys in Vault.
Local site security requirements may preclude use of passwordless SSH access between
management nodes. A variable has been added to the associated Ansible roles that
allows passwordless SSH setup to be disabled on any or all nodes. From the cloned
csm-config-management repository directory:
ncn-mw# grep csm_passwordless_ssh_enabled roles/trust-csm-ssh-keys/defaults/main.yaml
Example output:
csm_passwordless_ssh_enabled: 'false'
This variable can be overridden using either a host-specific setting or a global
setting that affects all nodes where the playbook is run. See
Customize Configuration Values
for more detailed information. Do not modify the value in the roles/trust-csm-ssh-keys/defaults/main.yaml file.
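To confirm which value a given file assigns, the variable can be read back from the role defaults and from any file that overrides it. The following is a minimal sketch. The function get_passwordless_ssh_flag is a hypothetical helper, and it only understands a simple top-level key and value on one line, not full YAML.

```shell
# Hypothetical helper: print the value assigned to
# csm_passwordless_ssh_enabled in one YAML file, with quotes removed.
# Handles only a simple top-level "key: value" line, not full YAML.
get_passwordless_ssh_flag() {
    sed -n 's/^csm_passwordless_ssh_enabled:[[:space:]]*//p' "$1" |
        tr -d "'\"" | head -n 1
}
```

Run against roles/trust-csm-ssh-keys/defaults/main.yaml, it prints the role default. Run against a site override file, it prints the overriding value, or nothing if the file does not set the variable.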
Published roles within product configuration repositories can contain more comprehensive
information about these role-specific flags. Refer to the Readme.md document associated with
each role for additional information, because role documentation is updated more frequently as
changes are introduced.
Consult the manual for each product in order to change the default configuration by
referring to the HPE Cray EX System Software Getting Started Guide (S-8000) 22.07
on the HPE Customer Support Center. Similar configuration values for disabling the
role will be required in these product-specific configuration repositories.
Modifying Ansible plays in a configuration repository will require a new commit and subsequent update of the configuration layer associated with the product.
Proceed to Configure the root password and SSH keys in Vault.
Use this procedure if switching from custom keys to the default CSM SSH keys only; otherwise it should be skipped.
In order to restore the default CSM keys, there are two options:
Restore by script.
The docs-csm RPM must be installed in order to use this script. See Check for Latest Documentation.
ncn-mw# /usr/share/doc/csm/scripts/operations/configuration/restore_ssh_keys.sh
Restore manually.
The keys can be deleted from Kubernetes directly. The csm-ssh-keys Kubernetes deployment
provided by CSM periodically checks the ConfigMap and secret containing the key information.
If these entries do not exist, it will recreate them from the default CSM keys. Therefore, in
order to manually restore the keys, delete the associated ConfigMap and secret. The default CSM-provided
keys will be republished.
Delete the csm-private-key Kubernetes secret.
ncn-mw# kubectl delete secret -n services csm-private-key
Delete the csm-public-key Kubernetes ConfigMap.
ncn-mw# kubectl delete configmap -n services csm-public-key
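Because the csm-ssh-keys deployment only checks periodically, the objects do not reappear immediately after deletion. The following is a minimal sketch of a polling loop that waits for them. The function wait_for_object is a hypothetical helper, and the default of 30 attempts at 10-second intervals is an arbitrary assumption, not a documented CSM interval.

```shell
# Hypothetical helper: wait until a Kubernetes object exists in the
# services namespace.
# Usage: wait_for_object <kind> <name> [attempts] [seconds_between_attempts]
wait_for_object() {
    local kind="$1" name="$2" attempts="${3:-30}" delay="${4:-10}"
    local i=1

    while [ "${i}" -le "${attempts}" ]; do
        # kubectl get exits non-zero while the object does not exist.
        if kubectl get "${kind}" -n services "${name}" >/dev/null 2>&1; then
            echo "${kind}/${name} is present"
            return 0
        fi
        sleep "${delay}"
        i=$((i + 1))
    done

    echo "Timed out waiting for ${kind}/${name}" >&2
    return 1
}
```

For example, wait_for_object secret csm-private-key followed by wait_for_object configmap csm-public-key would report once both default objects have been republished.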
Configure the root password and SSH keys in Vault
The root user password and SSH keys are managed on NCNs by using the
csm.password and csm.ssh_keys Ansible roles, respectively, located in the
CSM configuration management repository. root user passwords and SSH keys are
set and managed in Vault.
There are two options for setting the root password and SSH keys in Vault:
automated default or manual.
After these have been set in Vault, they will automatically be applied to NCNs during NCN personalization. For more information on how to configure and run NCN personalization, see 3. Perform management NCN personalization.
The automated default method uses the write_root_secrets_to_vault.py script to read in the current
root user password and SSH keys from the NCN where it is run, and write those to Vault. All of the NCNs are
booted from images which already had their root passwords and SSH keys customized during the
Deploy Management Nodes procedure of the CSM install. In most cases, these are the same password and keys that should be
written to Vault, and this script provides an easy way to do that.
Specifically, the write_root_secrets_to_vault.py script reads the following from the NCN where it is run:
- The root user password hash from the /etc/shadow file.
- The SSH private key, /root/.ssh/id_rsa.
- The SSH public key, /root/.ssh/id_rsa.pub.
This script can be run on any Kubernetes management NCN (master or worker). It only needs to be run once for the cluster, because the same Vault credentials are used for all management NCNs.
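The three inputs listed above can be checked on the chosen NCN before the script is run. The following is a minimal sketch. The function check_root_secret_inputs is a hypothetical pre-check and not part of CSM. It takes the paths as optional parameters so that it can be exercised against test files.

```shell
# Hypothetical pre-check (not part of CSM): confirm that the three inputs
# read by write_root_secrets_to_vault.py are present on this NCN.
# Paths are parameters so the defaults can be overridden for testing.
check_root_secret_inputs() {
    local shadow_file="${1:-/etc/shadow}"
    local private_key="${2:-/root/.ssh/id_rsa}"
    local public_key="${3:-/root/.ssh/id_rsa.pub}"
    local f status=0

    # Both key files must exist and be non-empty.
    for f in "${private_key}" "${public_key}"; do
        if [ ! -s "${f}" ]; then
            echo "Missing or empty: ${f}" >&2
            status=1
        fi
    done

    # The shadow file must contain a line for the root user.
    if ! grep -q '^root:' "${shadow_file}" 2>/dev/null; then
        echo "No root entry found in ${shadow_file}" >&2
        status=1
    fi

    return "${status}"
}
```

Calling check_root_secret_inputs with no arguments checks the default paths and returns non-zero if any input is missing, reporting every problem rather than stopping at the first.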
The docs-csm RPM must be installed in order to use this script. See Check for Latest Documentation.
Run the script with the following command:
ncn-mw# /usr/share/doc/csm/scripts/operations/configuration/write_root_secrets_to_vault.py
A successful execution will exit with return code 0 and will have output similar to the following:
Reading in file '/root/.ssh/id_rsa'
Reading in file '/root/.ssh/id_rsa.pub'
Reading in file '/etc/shadow'
Found root user line in /etc/shadow
Initializing Kubernetes client
Getting Vault token from vault/cray-vault-unseal-keys Kubernetes secret
Examining Kubernetes cray-vault service to determine URL for Vault API endpoint of secret/csm/users/root
Writing SSH keys and root password hash to secret/csm/users/root in Vault
Making POST request to http://10.18.232.40:8200/v1/secret/csm/users/root
Response status code = 204
Read back secrets from Vault to verify that the values were correctly saved
Making GET request to http://10.18.232.40:8200/v1/secret/csm/users/root
Response status code = 200
Validating that Vault contents match what was written to it
All secrets successfully written to Vault
SUCCESS
Proceed to 3. Perform management NCN personalization.
NOTE: Information on writing the root user password and the SSH keys to Vault is documented in two separate procedures. However, if both the password and the SSH keys are to be stored in Vault (the standard case), then the two procedures must be combined. Specifically, only a single write command must be made to Vault, containing both the password and the SSH keys. If multiple write commands are performed, only the information from the final command will persist.
Set the root user password and SSH keys in Vault by combining the following two procedures:
- The Configure Root Password in Vault procedure in Update NCN User Passwords.
- The Configure Root SSH Keys in Vault procedure in Update NCN User SSH Keys.
Proceed to 3. Perform management NCN personalization.
The previous procedures on this page did not make any changes to the NCNs on the system. They merely specified how the NCNs should be configured. In order for these configurations to be applied to the NCNs, a process called NCN personalization takes place using CFS.
There are two steps:
These can be accomplished by following Option 1: Automatically apply CSM configuration or Option 2: Manually apply CSM configuration.
The docs-csm RPM must be installed in order to use this script. See Check for Latest Documentation.
By default, the script will select the latest available CSM release. However,
it is possible to specify the CSM release version using the --csm-release parameter.
ncn-mw# /usr/share/doc/csm/scripts/operations/configuration/apply_csm_configuration.sh --csm-release <version e.g. 1.0.11>
By default, the script will perform steps that involve the following:
- The csm-config-management repository.
- The ncn-personalization.json configuration file.
- Nodes with the Management role.
- The ncn-personalization configuration in CFS, from the ncn-personalization.json file.
- The ncn-personalization configuration.
The script also supports several flags to override these behaviors:
--csm-release: Overrides the version of the CSM release that is used. Defaults
to the latest version.
Available versions can be found in the cray-product-catalog ConfigMap.
ncn-mw# kubectl -n services get cm cray-product-catalog
--csm-config-version: Overrides the version of the CSM configuration. This corresponds
to the version of a branch starting with cray/csm/ in the csm repository in VCS.
--git-commit: Overrides the Git commit cloned for the configuration content.
Defaults to the latest commit on the csm-release branch.
--git-clone-url: Overrides the source of the configuration content. Defaults
to https://api-gw-service-nmn.local/vcs/cray/csm-config-management.git.
--ncn-config-file: Sets a file other than ncn-personalization.json to be
used for the configuration.
--xnames: A comma-separated list of component names (xnames) to deploy to. Defaults to all
Management nodes in HSM.
--clear-state: Clears existing state from components to ensure that CFS runs. This
can be used if configuration needs to be re-run on successful nodes with no
change to the Git content since the previous run; for example, if the SSH
keys have changed.
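As an illustration of combining these flags, the following minimal sketch builds the comma-separated value that --xnames expects from a list of separate arguments. The function join_xnames is a hypothetical helper, and the xnames in the commented usage are made-up examples.

```shell
# Hypothetical helper: join xnames given as separate arguments into the
# comma-separated form that --xnames expects.
join_xnames() {
    local IFS=','
    # "$*" joins the arguments using the first character of IFS.
    echo "$*"
}

# Example usage (xnames are made up). Re-run configuration on two nodes
# even though the Git content is unchanged:
#
#   /usr/share/doc/csm/scripts/operations/configuration/apply_csm_configuration.sh \
#       --xnames "$(join_xnames x3000c0s1b0n0 x3000c0s3b0n0)" --clear-state
```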
In order to manually run management NCN personalization, first gather the following information:
Field | Value | Description |
---|---|---|
cloneUrl | https://api-gw-service-nmn.local/vcs/cray/csm-config-management.git | CSM configuration repository |
commit | Example: 5081c1ecea56002df41218ee39f6030c3eebdf27 | CSM configuration commit hash |
name | Example: csm-1.9.21 | CSM configuration layer name |
playbook | site.yml | Default site-wide Ansible playbook for CSM |
Retrieve the commit in the repository to use for configuration. If changes have been made to the default branch that was imported during a CSM installation or upgrade, use the commit containing the changes.
If no changes have been made, then the latest commit on the default branch for this version of CSM should be used.
Find the commit in the cray-product-catalog ConfigMap for the current version of CSM. For example:
ncn-mw# kubectl -n services get cm cray-product-catalog -o jsonpath='{.data.csm}'
Look for something similar to the following in the output:
1.2.1:
  configuration:
    clone_url: https://vcs.cmn.SYSTEM_DOMAIN_NAME/vcs/cray/csm-config-management.git
    commit: 43ecfa8236bed625b54325ebb70916f55884b3a4
    import_branch: cray/csm/1.9.24
    import_date: 2021-07-28 03:26:01.869501
    ssh_url: git@vcs.cmn.SYSTEM_DOMAIN_NAME:cray/csm-config-management.git
The commit will be different for each system and version of CSM. For
this example, it is 43ecfa8236bed625b54325ebb70916f55884b3a4.
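Rather than reading the commit from the output by eye, it can be extracted with a small filter. The following is a minimal sketch, assuming that the kubectl output is YAML with a top-level key for each CSM version and indented fields beneath it. The function extract_csm_commit is a hypothetical helper, not part of CSM.

```shell
# Hypothetical helper: read the csm entry of cray-product-catalog on stdin
# and print the configuration commit recorded for one CSM version.
# Assumes a top-level "<version>:" key with indented fields beneath it.
extract_csm_commit() {
    awk -v ver="$1" '
        # Start of the block for the wanted version.
        $0 == (ver ":")               { in_version = 1; next }
        # Any other unindented line starts a different version.
        in_version && /^[^ ]/         { in_version = 0 }
        # First commit field inside the wanted block.
        in_version && $1 == "commit:" { print $2; exit }
    '
}

# Example usage (the version is the one from the example output):
#
#   kubectl -n services get cm cray-product-catalog -o jsonpath='{.data.csm}' |
#       extract_csm_commit 1.2.1
```

The filter prints nothing if the requested version is not in the catalog, so an empty result should be treated as an error by the caller.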
Craft a new configuration layer entry for the new CSM.
{
  "name": "csm-<version>",
  "cloneUrl": "https://api-gw-service-nmn.local/vcs/cray/csm-config-management.git",
  "playbook": "site.yml",
  "commit": "<retrieved git commit ID>"
}
Create or update the NCN personalization configuration in CFS.
NOTE: The CSM configuration layer MUST be the first layer in the NCN personalization CFS configuration.
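The layer entry described above can also be generated from shell variables, which avoids copy and paste errors in the commit ID. The following is a minimal sketch. The function make_csm_layer and the variable CSM_CLONE_URL are hypothetical names. The function prints only the single layer entry, not a complete CFS configuration file, and it assumes a full 40-character SHA-1 commit ID.

```shell
# Clone URL taken from the table of required fields.
CSM_CLONE_URL="https://api-gw-service-nmn.local/vcs/cray/csm-config-management.git"

# Hypothetical helper: print the CSM layer entry for a version and commit.
# Usage: make_csm_layer <version> <commit>
make_csm_layer() {
    local version="$1" commit="$2"

    # Reject empty values and anything other than lowercase hexadecimal.
    case "${commit}" in
        ""|*[!0-9a-f]*)
            echo "Invalid commit ID: ${commit}" >&2
            return 1
            ;;
    esac
    # A full SHA-1 Git commit ID is 40 characters long.
    if [ "${#commit}" -ne 40 ]; then
        echo "Commit ID must be 40 characters: ${commit}" >&2
        return 1
    fi

    printf '{\n  "name": "csm-%s",\n  "cloneUrl": "%s",\n  "playbook": "site.yml",\n  "commit": "%s"\n}\n' \
        "${version}" "${CSM_CLONE_URL}" "${commit}"
}
```

For example, make_csm_layer 1.2.1 43ecfa8236bed625b54325ebb70916f55884b3a4 prints an entry in the form shown above, which can then be placed as the first layer of the NCN personalization configuration.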