Non-compute node (NCN) personalization applies post-boot configuration to the HPE Cray EX management nodes. Several HPE Cray EX product environments outside of CSM require NCN personalization to function. To configure each of these products on NCNs, consult that product's manual, which can be found by referring to the HPE Cray EX System Software Getting Started Guide S-8000 on the HPE Customer Support Center.
This procedure defines the NCN personalization process for the CSM product using the Configuration Framework Service (CFS).
During a fresh install, carry out these procedures in order. Later, individual procedures may be re-run as needed. These procedures are not done as part of CSM upgrades. In particular, during upgrades, different procedures are provided that handle management NCN personalization. Review the upgrade documentation for more information.
This procedure should be run during CSM installation and at any later time when the SSH keys need to be changed per site requirements.
The goal of passwordless SSH is to provide an easy way to make interactive passwordless SSH connections from and between CSM product environments (management nodes) to downstream managed product environments (COS, UAN, etc.), without requiring each downstream environment to create and apply individual changes to NCNs. It is also the primary way to manage passwordless SSH configuration between management nodes. Passwordless SSH from downstream nodes into CSM management nodes is not intended or supported.
Passwordless SSH keypairs for Cray System Management (CSM) are created automatically and maintained with a Kubernetes deployment and staged into Kubernetes secrets (csm-private-key) and ConfigMaps (csm-public-key) in the services namespace. Administrators can use these provided keys, provide their own keys, or use their own solution for authentication.
The management of keys on NCNs is achieved by the trust-csm-ssh-keys and passwordless-ssh Ansible roles in the CSM configuration management repository.
The SSH keypair is applied to management nodes using NCN personalization.
NOTE
CFS itself does not use the CSM-provided (or user-supplied) SSH keys to make connections between nodes. CFS will continue to function if passwordless SSH is disabled between CSM and other product environments.
The default CSM Ansible plays are already configured to enable passwordless SSH. No further action is necessary before proceeding to Configure the root password and SSH keys in Vault.
(ncn-mw#) Administrators may elect to replace the CSM-provided keys with their own custom keys.
Set variables to the locations of the public and private SSH key files.
Replace the values in the examples below with the paths to the desired key files on the system.
PUBLIC_KEY_FILE=/path/to/id_rsa-csm.pub
PRIVATE_KEY_FILE=/path/to/id_rsa-csm
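Before handing the files to either option, it may be worth confirming that both paths point at usable files and that the two halves belong together. The following is a minimal sketch, not part of the CSM tooling: the function names check_key_files and keys_match are hypothetical, and keys_match assumes that ssh-keygen is available on the node and that the private key is not passphrase-protected.

```shell
# Hypothetical helper: fail if any listed key file is missing, empty, or unreadable.
check_key_files() {
    for f in "$@"; do
        if [ ! -s "$f" ] || [ ! -r "$f" ]; then
            echo "Missing, empty, or unreadable key file: $f" >&2
            return 1
        fi
    done
}

# Hypothetical helper: derive the public key from the private key with
# 'ssh-keygen -y' and compare its type and key data to the public key file.
# Arguments: <private key file> <public key file>
keys_match() {
    derived=$(ssh-keygen -y -f "$1" | awk '{print $1, $2; exit}')
    stored=$(awk '{print $1, $2; exit}' "$2")
    [ -n "${derived}" ] && [ "${derived}" = "${stored}" ]
}

# Example usage, once the variables above have been set:
#   check_key_files "${PUBLIC_KEY_FILE}" "${PRIVATE_KEY_FILE}" &&
#       keys_match "${PRIVATE_KEY_FILE}" "${PUBLIC_KEY_FILE}" &&
#       echo "Key files look consistent"
```

Only the key type and key data are compared, so a differing comment field in the public key file does not cause a mismatch.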
Provide the custom keys by script or manually.
There are two options for providing the keys.
Provide custom SSH keys by script.
The replace_ssh_keys.sh script can be used to replace the keys from files. The docs-csm RPM must be installed in order to use this script. See Check for Latest Documentation.
/usr/share/doc/csm/scripts/operations/configuration/replace_ssh_keys.sh \
--public-key-file "${PUBLIC_KEY_FILE}" --private-key-file "${PRIVATE_KEY_FILE}"
Manually provide custom SSH keys
The keys stored in Kubernetes can be updated directly.
Replace the private key half:
KEY64=$(cat "${PRIVATE_KEY_FILE}" | base64) &&
kubectl get secret -n services csm-private-key -o json | \
jq --arg value "$KEY64" '.data["value"]=$value' | kubectl apply -f - &&
unset KEY64
Replace the public key half:
kubectl delete configmap -n services csm-public-key &&
cat "${PUBLIC_KEY_FILE}" | base64 > ./value &&
kubectl create configmap --from-file value csm-public-key --namespace services &&
rm ./value
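After a manual replacement, the stored public key can be read back and compared with the local file. The sketch below assumes the ConfigMap was created exactly as shown above, so that its value key holds the base64-encoded public key. The function name csm_public_key_matches is hypothetical.

```shell
# Hypothetical helper: succeed only if the public key stored in the
# csm-public-key ConfigMap decodes to the contents of the given file.
# Assumes the ConfigMap 'value' key holds base64 text, as created above.
csm_public_key_matches() {
    key_file="$1"
    stored=$(kubectl get configmap -n services csm-public-key \
        -o jsonpath='{.data.value}' | base64 -d)
    [ -n "${stored}" ] && [ "${stored}" = "$(cat "${key_file}")" ]
}

# Example usage:
#   csm_public_key_matches "${PUBLIC_KEY_FILE}" && echo "Public key updated"
```

The private half is deliberately not read back here, to avoid printing or storing secret material in shell variables longer than necessary.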
Passwordless SSH with the provided keys will be set up once NCN personalization runs on the NCNs.
NOTE: This keypair may be the same keypair used for the NCN root user, but it is not required to be the same. Either option is valid.
Proceed to Configure the root password and SSH keys in Vault.
Local site security requirements may preclude use of passwordless SSH access between management nodes. A variable has been added to the associated Ansible roles that allows disabling of passwordless SSH setup to any or all nodes.
(ncn-mw#) From the cloned csm-config-management repository directory:
grep csm_passwordless_ssh_enabled roles/trust-csm-ssh-keys/defaults/main.yaml
Example output:
csm_passwordless_ssh_enabled: 'false'
This variable can be overridden using either a host-specific setting or a global setting that affects all nodes where the playbook is run. See Customize Configuration Values for more detailed information. Do not modify the value in the roles/trust-csm-ssh-keys/defaults/main.yaml file.
Published roles within product configuration repositories can contain more comprehensive information regarding these role-specific flags. Refer to the Readme.md document associated with each role for additional information, because role documentation is updated more frequently as changes are introduced.
To change the default configuration of another product, consult that product's manual, which can be found by referring to the HPE Cray EX System Software Getting Started Guide S-8000 on the HPE Customer Support Center. Similar configuration values for disabling the role will be required in these product-specific configuration repositories.
Modifying Ansible plays in a configuration repository will require a new commit and subsequent update of the configuration layer associated with the product.
Proceed to Configure the root password and SSH keys in Vault.
Use this procedure only if switching from custom keys to the default CSM SSH keys; otherwise, skip it.
(ncn-mw#) In order to restore the default CSM keys, there are two options:
Restore by script.
The docs-csm RPM must be installed in order to use this script. See Check for Latest Documentation.
/usr/share/doc/csm/scripts/operations/configuration/restore_ssh_keys.sh
Restore manually.
The keys can be deleted from Kubernetes directly. The csm-ssh-keys Kubernetes deployment provided by CSM periodically checks the ConfigMap and secret containing the key information. If these entries do not exist, it will recreate them from the default CSM keys. Therefore, in order to manually restore the keys, delete the associated ConfigMap and secret. The default CSM-provided keys will be republished.
Delete the csm-private-key Kubernetes secret.
kubectl delete secret -n services csm-private-key
Delete the csm-public-key Kubernetes ConfigMap.
kubectl delete configmap -n services csm-public-key
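Because the csm-ssh-keys deployment recreates the entries only on its periodic check, there can be a delay before the default keys reappear. A simple polling loop can confirm that both objects exist again. This is a sketch only: the function name wait_for_csm_keys is hypothetical, and the default of 30 attempts at 10-second intervals is an arbitrary assumption, since the check interval is not stated here.

```shell
# Hypothetical helper: poll until both the csm-private-key secret and the
# csm-public-key ConfigMap exist again in the services namespace.
# Arguments: [attempts] [delay in seconds]; the defaults are arbitrary.
wait_for_csm_keys() {
    attempts="${1:-30}"
    delay="${2:-10}"
    i=0
    while [ "${i}" -lt "${attempts}" ]; do
        if kubectl get secret -n services csm-private-key >/dev/null 2>&1 &&
            kubectl get configmap -n services csm-public-key >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "${delay}"
    done
    return 1
}

# Example usage:
#   wait_for_csm_keys && echo "Default CSM keys have been republished"
```

The loop only checks that the objects exist; it does not confirm that their contents are the default keys.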
Configure the root password and SSH keys in Vault
The root user password and SSH keys are managed on NCNs by using the csm.password and csm.ssh_keys Ansible roles, respectively, located in the CSM configuration management repository. root user passwords and SSH keys are set and managed in Vault.
There are two options for setting the root password and SSH keys in Vault: automated default or manual.
After these have been set in Vault, they will automatically be applied to NCNs during NCN personalization. For more information on how to configure and run NCN personalization, see 3. Perform management NCN personalization.
The automated default method uses the write_root_secrets_to_vault.py script to read in the current root user password and SSH keys from the NCN where it is run, and write those to Vault. All of the NCNs are booted from images which already had their root passwords and SSH keys customized during the Deploy Management Nodes procedure of the CSM install. In most cases, these are the same password and keys that should be written to Vault, and this script provides an easy way to do that.
Specifically, the write_root_secrets_to_vault.py script reads the following from the NCN where it is run:
- The root user password hash from the /etc/shadow file.
- The private SSH key file /root/.ssh/id_rsa.
- The public SSH key file /root/.ssh/id_rsa.pub.
This script can be run on any Kubernetes management NCN (master or worker). It only needs to be run once for the cluster, because the same Vault credentials are used for all management NCNs.
The docs-csm RPM must be installed in order to use this script. See Check for Latest Documentation.
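Since the script reads three specific files, a quick pre-check on the chosen NCN can catch a missing key or shadow entry before anything is written to Vault. This is a hedged sketch rather than part of the CSM tooling: the function name check_root_secret_sources and its optional prefix argument (useful only for trying the check against a copy of the files) are hypothetical.

```shell
# Hypothetical pre-check: confirm that the three inputs read by
# write_root_secrets_to_vault.py are present and readable.
# Optional argument: a directory prefix (assumption, for dry runs on copies).
check_root_secret_sources() {
    prefix="${1:-}"
    for f in /root/.ssh/id_rsa /root/.ssh/id_rsa.pub /etc/shadow; do
        if [ ! -r "${prefix}${f}" ]; then
            echo "Cannot read ${prefix}${f}" >&2
            return 1
        fi
    done
    # The script needs the root user's line from the shadow file.
    if ! grep -q '^root:' "${prefix}/etc/shadow"; then
        echo "No root entry found in ${prefix}/etc/shadow" >&2
        return 1
    fi
}

# Example usage on the NCN where the script will be run:
#   check_root_secret_sources && echo "Inputs are present"
```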
(ncn-mw#) Run the script with the following command:
/usr/share/doc/csm/scripts/operations/configuration/write_root_secrets_to_vault.py
A successful execution will exit with return code 0 and will have output similar to the following:
Reading in file '/root/.ssh/id_rsa'
Reading in file '/root/.ssh/id_rsa.pub'
Reading in file '/etc/shadow'
Found root user line in /etc/shadow
Initializing Kubernetes client
Getting Vault token from vault/cray-vault-unseal-keys Kubernetes secret
Examining Kubernetes cray-vault service to determine URL for Vault API endpoint of secret/csm/users/root
Writing SSH keys and root password hash to secret/csm/users/root in Vault
Making POST request to http://10.18.232.40:8200/v1/secret/csm/users/root
Response status code = 204
Read back secrets from Vault to verify that the values were correctly saved
Making GET request to http://10.18.232.40:8200/v1/secret/csm/users/root
Response status code = 200
Validating that Vault contents match what was written to it
All secrets successfully written to Vault
SUCCESS
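When the script is run from automation rather than interactively, the return code is easier to act on than the log text. The wrapper below is an illustrative sketch: the function name write_root_secrets is hypothetical, and the only behavior it relies on is the documented return code of 0 on success.

```shell
# Path to the script, as given in this procedure.
WRITE_SECRETS=/usr/share/doc/csm/scripts/operations/configuration/write_root_secrets_to_vault.py

# Hypothetical wrapper: run the script (or another command given as the first
# argument) and report the outcome based only on its return code.
write_root_secrets() {
    script="${1:-${WRITE_SECRETS}}"
    if "${script}"; then
        echo "Root password and SSH keys written to Vault"
    else
        rc=$?
        echo "Script failed with return code ${rc}; Vault may not have been updated" >&2
        return "${rc}"
    fi
}

# Example usage:
#   write_root_secrets || echo "Review the script output before continuing"
```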
Proceed to 3. Perform management NCN personalization.
NOTE: Information on writing the root user password and the SSH keys to Vault is documented in two separate procedures. However, if both the password and the SSH keys are to be stored in Vault (the standard case), then the two procedures must be combined. Specifically, only a single write command must be made to Vault, containing both the password and the SSH keys. If multiple write commands are performed, only the information from the final command will persist.
Set the root user password and SSH keys in Vault by combining the following two procedures:
- The Configure Root Password in Vault procedure in Update NCN User Passwords.
- The Configure Root SSH Keys in Vault procedure in Update NCN User SSH Keys.
Proceed to 3. Perform management NCN personalization.
The previous procedures on this page did not make any changes to the NCNs on the system. They merely specified how the NCNs should be configured. In order for these configurations to be applied to the NCNs, a process called NCN personalization takes place using CFS.
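There are two steps: create or update the NCN personalization configuration in CFS, and then apply that configuration to the NCNs.
These can be accomplished by following Option 1: Automatic configuration or Option 2: Manual configuration.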
This option uses a script to create or update the NCN personalization configuration in CFS, and then to apply the configuration to the NCNs.
The docs-csm RPM must be installed in order to use this script. See Check for Latest Documentation.
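The options used with the script depend on the context in which this procedure is being followed: during a CSM fresh install, or when forcing NCN personalization to run again at a later time.
In either case, although it is not usually required, it is possible to further customize the behavior of the script. For more information, see the descriptions of the script's modes of operation and parameters later in this section.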
At this point in the fresh install process, the only product installed on the system is CSM. Therefore the CFS configuration that the script creates for management NCN personalization will only contain the CSM layer.
(ncn-mw#) When performing this procedure during a CSM fresh install, the recommended command to run is the following:
/usr/share/doc/csm/scripts/operations/configuration/apply_csm_configuration.sh
(ncn-mw#) In order to force NCN personalization to run on all of the management NCNs, run the following command:
/usr/share/doc/csm/scripts/operations/configuration/apply_csm_configuration.sh --no-config-change --clear-state
The script has two basic modes of operation:
In the default mode (which can be explicitly specified by using the --config-change parameter), the script will create or update a CFS configuration and then apply it to the management NCNs.
In this mode, the script performs the following steps:
1. Clones the csm-config-management repository.
2. Finds the nodes in HSM with the Management role.
3. Creates or updates the ncn-personalization configuration in CFS to contain only the CSM layer for the latest installed CSM release.
4. Sets the CFS configuration for those nodes to ncn-personalization.
In its other mode (specified by using the --no-config-change parameter), the script applies an existing CFS configuration to the management NCNs. In this mode, the script performs the following steps:
1. Finds the nodes in HSM with the Management role.
2. Sets the CFS configuration for those nodes to ncn-personalization.
(ncn-mw#) View a usage message for the script, documenting all of its parameters, by running the following command:
/usr/share/doc/csm/scripts/operations/configuration/apply_csm_configuration.sh -h
The script also supports several flags to override the default behaviors described previously. Some of these flags apply to both modes of operation, whereas others apply only to the --config-change mode.
--clear-state: Clears existing state from components to ensure that CFS runs, even if no changes have been made to the content of the CFS configuration.
--config-name: For --config-change mode, this is the name of the CFS configuration that is created or updated. For --no-config-change mode, this is the name of the existing CFS configuration to use on the management NCNs. In either case, it defaults to ncn-personalization.
--no-clear-err: By default, the error count is cleared on all of the management NCN CFS components. If this parameter is specified, the error count is not cleared.
--xnames: A comma-separated list of component names (xnames) to deploy to. Defaults to all Management nodes in HSM.
--config-change parameters
--csm-release: Overrides the version of the CSM release that is used.
(ncn-mw#) Available versions can be found in the cray-product-catalog ConfigMap.
kubectl -n services get cm cray-product-catalog
If not specified, the script chooses the latest version found in the above ConfigMap.
--csm-config-version: Overrides the version of the CSM configuration. This corresponds to the version of a branch starting with cray/csm/ in the csm repository in VCS.
--git-commit: Overrides the Git commit cloned for the configuration content. Defaults to the latest commit on the csm-release branch.
--git-clone-url: Overrides the source of the configuration content. Defaults to https://api-gw-service-nmn.local/vcs/cray/csm-config-management.git.
--ncn-config-file: This argument is used in order to base the new CFS configuration on the specified file, rather than creating a new one that contains only the CSM layer. If a file is specified, then the new CFS configuration made by the script will have the contents of the specified file, except that the CSM layer will be added or updated to use the new Git commit.
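As an illustration of combining these parameters, the sketch below prints (without running) a command that forces NCN personalization to run again on a subset of nodes using the existing configuration. It uses only flags described above. The function name rerun_command is hypothetical, and the xnames shown are placeholder values, not nodes known to exist on any particular system.

```shell
# Path to the script, as given in this procedure.
APPLY_SCRIPT=/usr/share/doc/csm/scripts/operations/configuration/apply_csm_configuration.sh

# Hypothetical helper: print the command line that would re-run NCN
# personalization on a comma-separated list of xnames, without changing
# the CFS configuration. Nothing is executed.
rerun_command() {
    xnames="$1"
    printf '%s --no-config-change --clear-state --xnames %s\n' \
        "${APPLY_SCRIPT}" "${xnames}"
}

# Placeholder xnames; substitute the component names of the target NCNs.
rerun_command "x3000c0s1b0n0,x3000c0s3b0n0"
```

Printing the command first makes it easy to review the target list before running it on a management node.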
This is only recommended for experienced administrators with specific reasons not to use the automatic procedure.
In order to manually run management NCN personalization, first gather the following information:
Field | Value | Description |
---|---|---|
cloneUrl | https://api-gw-service-nmn.local/vcs/cray/csm-config-management.git | CSM configuration repository |
commit | Example: 5081c1ecea56002df41218ee39f6030c3eebdf27 | CSM configuration commit hash |
name | Example: csm-1.9.21 | CSM configuration layer name |
playbook | site.yml | Default site-wide Ansible playbook for CSM |
Retrieve the commit in the repository to use for configuration. If changes have been made to the default branch that was imported during a CSM installation or upgrade, use the commit containing the changes.
If no changes have been made, then the latest commit on the default branch for this version of CSM should be used.
(ncn-mw#) Find the commit in the cray-product-catalog ConfigMap for the current version of CSM. For example:
kubectl -n services get cm cray-product-catalog -o jsonpath='{.data.csm}'
Look for something similar to the following in the output:
1.2.0:
  configuration:
    clone_url: https://vcs.cmn.SYSTEM_DOMAIN_NAME/vcs/cray/csm-config-management.git
    commit: 43ecfa8236bed625b54325ebb70916f55884b3a4
    import_branch: cray/csm/1.9.24
    import_date: 2021-07-28 03:26:01.869501
    ssh_url: git@vcs.cmn.SYSTEM_DOMAIN_NAME:cray/csm-config-management.git
The commit will be different for each system and version of CSM. For this example, it is 43ecfa8236bed625b54325ebb70916f55884b3a4.
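Rather than reading the commit out of the output by eye, it can be extracted with a small filter. The sketch below assumes the ConfigMap data is laid out as in the example output, with the version as an unindented, unquoted key and commit: nested beneath it. The function name csm_commit_for_version is hypothetical, and this is a line-based filter, not a full YAML parser.

```shell
# Hypothetical helper: read the csm product catalog YAML on standard input
# and print the configuration commit for the given CSM version.
# Assumes unquoted top-level version keys and an indented 'commit:' line.
csm_commit_for_version() {
    awk -v ver="$1:" '
        /^[^ ]/ { in_ver = ($1 == ver); next }          # top-level key: track whether it is ours
        in_ver && $1 == "commit:" { print $2; exit }    # first commit under the matching version
    '
}

# Example usage:
#   kubectl -n services get cm cray-product-catalog -o jsonpath='{.data.csm}' |
#       csm_commit_for_version 1.2.0
```

If the requested version is not present, the filter prints nothing, so an empty result should be treated as "not found".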
Craft a new configuration layer entry for CSM, using the values gathered above.
{
  "name": "csm-<version>",
  "cloneUrl": "https://api-gw-service-nmn.local/vcs/cray/csm-config-management.git",
  "playbook": "site.yml",
  "commit": "<retrieved git commit ID>"
}
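The layer entry can also be generated from the gathered values instead of being edited by hand. The following is a minimal sketch under the assumption that the version and commit are plain strings that need no JSON escaping. The function name csm_layer_json is hypothetical.

```shell
# Hypothetical helper: print a CSM configuration layer entry as JSON.
# Arguments: <CSM version> <Git commit ID>
# No JSON escaping is performed; both values are assumed to be plain strings.
csm_layer_json() {
    version="$1"
    commit="$2"
    printf '{\n  "name": "csm-%s",\n  "cloneUrl": "%s",\n  "playbook": "site.yml",\n  "commit": "%s"\n}\n' \
        "${version}" \
        "https://api-gw-service-nmn.local/vcs/cray/csm-config-management.git" \
        "${commit}"
}

# Example usage, with the values from the example output earlier in this procedure:
#   csm_layer_json 1.2.0 43ecfa8236bed625b54325ebb70916f55884b3a4 > csm-layer.json
```

The output file name csm-layer.json in the usage comment is only an example; the JSON is then used in the next step when creating or updating the CFS configuration.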
Create or update the NCN personalization configuration in CFS.
NOTE
The CSM configuration layer MUST be the first layer in the NCN personalization CFS configuration.
Follow the option that applies to the current situation:
If a CSM fresh install is being performed, then create a management NCN personalization CFS configuration.
Create a CFS configuration whose only layer is the CSM configuration layer, using the JSON from the previous step. See Perform NCN Personalization.
Otherwise, update the existing management NCN personalization CFS configuration, replacing the existing CSM configuration layer with the JSON from the previous step. See Perform NCN Personalization.