site init
These procedures guide administrators through setting up the site-init
directory which contains important customizations for various products.
site-init
Directory
The shasta-cfg
directory included in the CSM release tarball includes relatively static,
installation-centric artifacts, such as:
site-init
directory
NOTE
If the pre-installation is resuming here, ensure the environment variables have been properly set by following Set reusable environment variables and then coming back to this page.
(pit#
) Set the SITE_INIT
variable.
Important: All procedures on this page assume that the
SITE_INIT
variable has been set.
SITE_INIT="${PITDATA}/prep/site-init"
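Because every later step depends on these variables, it can help to fail early when one is missing. The following is a minimal sketch and is not part of the CSM tooling: the helper name require_vars is hypothetical, and the variable list in the usage comment is simply the set of variables referenced on this page.

```shell
# Hypothetical helper (not shipped with CSM): succeed only if every named
# variable is set to a non-empty value; report each missing one on stderr.
require_vars() (
    missing=0
    for name in "$@"; do
        # Indirect lookup of the variable named by ${name}. The names come
        # from the caller, so only pass trusted identifiers.
        eval "value=\${${name}:-}"
        if [ -z "${value}" ]; then
            echo "ERROR: ${name} is not set" >&2
            missing=1
        fi
    done
    return "${missing}"
)

# Example: check the variables used on this page before continuing.
# require_vars PITDATA CSM_PATH SYSTEM_NAME SITE_INIT || echo "fix the environment first"
```

The function body runs in a subshell, so its working variables do not leak into the interactive session.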
(pit#
) Create the site-init
directory.
mkdir -pv "${SITE_INIT}"
(pit#
) Initialize site-init
from CSM.
"${CSM_PATH}/shasta-cfg/meta/init.sh" "${SITE_INIT}"
The following steps update ${SITE_INIT}/customizations.yaml
with system-specific customizations.
(pit#
) Change into the site-init
directory.
cd "${SITE_INIT}"
(pit#
) Merge the system-specific settings generated by CSI into
customizations.yaml
.
yq merge -xP -i "${SITE_INIT}/customizations.yaml" <(yq prefix -P "${PITDATA}/prep/${SYSTEM_NAME}/customizations.yaml" spec)
(pit#
) Set the cluster name.
yq write -i "${SITE_INIT}/customizations.yaml" spec.wlm.cluster_name "${SYSTEM_NAME}"
(pit#
) Make a backup copy of ${SITE_INIT}/customizations.yaml
.
cp -pv "${SITE_INIT}/customizations.yaml" "${SITE_INIT}/customizations.yaml.prepassword"
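If this step is repeated after credentials have already been edited, a plain copy would replace the pristine backup with the edited file. A minimal sketch of a guard against that, assuming the same file naming as above; the helper name backup_once is hypothetical:

```shell
# Hypothetical helper: copy FILE to FILE.SUFFIX, but never overwrite an
# existing backup, so the first (pre-edit) copy survives reruns.
backup_once() (
    file="$1"
    backup="$1.$2"
    if [ -e "${backup}" ]; then
        echo "keeping existing backup: ${backup}" >&2
        return 0
    fi
    cp -p "${file}" "${backup}"
)

# Example:
# backup_once "${SITE_INIT}/customizations.yaml" prepassword
```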
(pit#
) Review the configuration to generate these sealed secrets in customizations.yaml
in the site-init
directory:
spec.kubernetes.sealed_secrets.cray_reds_credentials
spec.kubernetes.sealed_secrets.cray_meds_credentials
spec.kubernetes.sealed_secrets.cray_hms_rts_credentials
Ensure that the Username
and Password
references match the existing settings of your system hardware components.
NOTE
- The
cray_reds_credentials
are used by the River Endpoint Discovery Service (REDS) for River components.
- The
cray_meds_credentials
are used by the Mountain Endpoint Discovery Service (MEDS) for the liquid-cooled components in an Olympus (Mountain) cabinet.
- The
cray_hms_rts_credentials
are used by the Redfish Translation Service (RTS) for any hardware components which are not managed by Redfish, such as a ServerTech PDU in a River Cabinet.
See the
Decrypt Sealed Secrets for Review
section of Manage Sealed Secrets if credentials from prior installations need to be examined.
vim "${SITE_INIT}/customizations.yaml"
(pit#
) Review the changes that you made.
diff "${SITE_INIT}/customizations.yaml" "${SITE_INIT}/customizations.yaml.prepassword"
(pit#
) Validate that REDS/MEDS/RTS credentials are correct.
For all credentials, make sure that Username
and Password
values are correct.
Validate REDS credentials:
NOTE
These credentials are used by the REDS and HMS discovery services, targeting River Redfish BMC endpoints and management switches.
For
vault_redfish_defaults
, the only entry used is: {"Cray": {"Username": "root", "Password": "XXXX"}}
Ensure the
Cray
key exists. This key is not used in any of the other credential specifications.
yq read "${SITE_INIT}/customizations.yaml" 'spec.kubernetes.sealed_secrets.cray_reds_credentials.generate.data[*].args.value' | jq
Validate MEDS credentials:
These credentials are used by the MEDS service, targeting Redfish BMC endpoints.
yq read "${SITE_INIT}/customizations.yaml" 'spec.kubernetes.sealed_secrets.cray_meds_credentials.generate.data[0].args.value' | jq
Validate RTS credentials:
These credentials are used by the Redfish Translation Service, targeting River Redfish BMC endpoints and PDU controllers.
yq read "${SITE_INIT}/customizations.yaml" 'spec.kubernetes.sealed_secrets.cray_hms_rts_credentials.generate.data[*].args.value' | jq
To customize the PKI Certificate Authority (CA) used by the platform, see Certificate Authority.
IMPORTANT
The CA may not be modified after install.
NOTE
Skip past LDAP configuration to here if there is no LDAP configuration at this time. If LDAP should be enabled later, follow Add LDAP User Federation after installation.
(pit#
) Set environment variables for the LDAP server and its port.
In the following example, the LDAP server has the hostname dcldap2.hpc.amslabs.hpecorp.net
and is using the port 636.
LDAP=dcldap2.hpc.amslabs.hpecorp.net
PORT=636
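A typo in either variable only surfaces later as a failed openssl connection, so a quick sanity check here may save time. This is a sketch only; the helper name valid_port is hypothetical and the check covers nothing beyond the port being a decimal number in the TCP range.

```shell
# Hypothetical helper: succeed if the argument is a decimal TCP port (1-65535).
valid_port() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    [ "${#1}" -le 5 ] && [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

# Example:
# [ -n "${LDAP}" ] && valid_port "${PORT}" || echo "check LDAP and PORT" >&2
```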
(pit#
) Load the openjdk
container image.
NOTE
Requires a properly configured Docker or Podman environment.
"${CSM_PATH}/hack/load-container-image.sh" artifactory.algol60.net/csm-docker/stable/docker.io/library/openjdk:11-jre-slim
(pit#
) Get the issuer certificate.
Retrieve the issuer certificate for the LDAP server at port 636. Use openssl s_client
to connect
and show the certificate chain returned by the LDAP host:
openssl s_client -showcerts -connect "${LDAP}:${PORT}" </dev/null
Enter the issuer’s certificate into cacert.pem
.
Either manually extract (i.e., cut/paste) the issuer’s
certificate into cacert.pem
, or try the following commands to
create it automatically.
NOTE
The following commands were verified using OpenSSL version 1.1.1d
and use the -nameopt RFC2253
option to ensure consistent formatting of distinguished names. Older versions of OpenSSL may not support -nameopt
on the s_client
command or may use a different default format. If the following commands are unsuccessful, the issuer certificate can be manually extracted from the output of the above openssl s_client
example.
(pit#
) Observe the issuer’s DN.
openssl s_client -showcerts -nameopt RFC2253 -connect "${LDAP}:${PORT}" </dev/null 2>/dev/null | grep issuer= | sed -e 's/^issuer=//'
Expected output includes a line similar to one of the below examples:
Self-signed Certificate:
emailAddress=dcops@hpe.com,CN=Data Center,OU=HPC/MCS,O=HPE,ST=WI,C=US
Signed Certificate:
CN=DigiCert Global G2 TLS RSA SHA256 2020 CA1,O=DigiCert Inc,C=US
(pit#
) Extract the issuer’s certificate.
NOTE
The issuer DN is properly escaped as part of the awk
pattern below. It must be changed to match the value for emailAddress
, CN
, OU
, etc. for your LDAP. If the value you are using is different, be sure to escape it properly!
openssl s_client -showcerts -nameopt RFC2253 -connect "${LDAP}:${PORT}" </dev/null 2>/dev/null |
awk '/s:emailAddress=dcops@hpe.com,CN=Data Center,OU=HPC\/MCS,O=HPE,ST=WI,C=US/,/END CERTIFICATE/' |
awk '/BEGIN CERTIFICATE/,/END CERTIFICATE/' > cacert.pem
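The escaping requirement comes from using the DN as an awk regular expression. One possible alternative, shown here as a sketch rather than the documented method, is to match the subject line as a literal string with awk's index() function, which needs no escaping. The helper name extract_cert_by_subject and the ISSUER_DN environment variable are hypothetical; the "s:" prefix assumes the same s_client output format as the command above.

```shell
# Hypothetical helper: print the PEM block of the first certificate whose
# subject line ("s:<DN>") contains the given DN as a literal substring.
# Reads `openssl s_client -showcerts` output on stdin.
extract_cert_by_subject() (
    ISSUER_DN="$1" awk '
        # index() is a literal substring match, so the DN needs no escaping.
        index($0, "s:" ENVIRON["ISSUER_DN"]) { found = 1 }
        found && /BEGIN CERTIFICATE/         { printing = 1 }
        printing                             { print }
        printing && /END CERTIFICATE/        { exit }
    '
)

# Example (DN copied verbatim from the "Observe the issuer's DN" step):
# openssl s_client -showcerts -nameopt RFC2253 -connect "${LDAP}:${PORT}" </dev/null 2>/dev/null |
#     extract_cert_by_subject 'emailAddress=dcops@hpe.com,CN=Data Center,OU=HPC/MCS,O=HPE,ST=WI,C=US' > cacert.pem
```

Whichever method is used, running openssl x509 -noout -subject -issuer -in cacert.pem afterwards shows whether the extracted file holds the intended certificate.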
(pit#
) Create certs.jks
.
NOTE
The alias
cray-data-center-ca
used in this command should be changed to match your LDAP.
podman run --rm -v "$(pwd):/data" \
artifactory.algol60.net/csm-docker/stable/docker.io/library/openjdk:11-jre-slim keytool \
-importcert -trustcacerts -file /data/cacert.pem -alias cray-data-center-ca \
-keystore /data/certs.jks -storepass password -noprompt
NOTE
If the command is accidentally run more than once, the console displays the following error, which can be ignored:
keytool error: java.lang.Exception: Certificate not imported, alias <cray-data-center-ca> already exists
(pit#
) Create certs.jks.b64
by base-64 encoding certs.jks
.
base64 certs.jks > certs.jks.b64
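Before injecting the encoded keystore, it may be worth confirming that it decodes back to the original file. A minimal sketch, assuming a base64 implementation that accepts -d (GNU coreutils does); the helper name b64_matches is hypothetical:

```shell
# Hypothetical helper: succeed if decoding B64FILE reproduces FILE byte for byte.
# Usage: b64_matches FILE B64FILE
b64_matches() {
    base64 -d "$2" | cmp -s - "$1"
}

# Example:
# b64_matches certs.jks certs.jks.b64 && echo "certs.jks.b64 is consistent"
```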
(pit#
) Inject and encrypt certs.jks.b64
into customizations.yaml
.
cat <<EOF | yq w - 'data."certs.jks"' "$(<certs.jks.b64)" | \
yq r -j - | "${SITE_INIT}/utils/secrets-encrypt.sh" | \
yq w -f - -i "${SITE_INIT}/customizations.yaml" 'spec.kubernetes.sealed_secrets.cray-keycloak'
{
"kind": "Secret",
"apiVersion": "v1",
"metadata": {
"name": "keycloak-certs",
"namespace": "services",
"creationTimestamp": null
},
"data": {}
}
EOF
(pit#
) Update the keycloak_users_localize
sealed secret with the
appropriate value for ldap_connection_url
.
(pit#
) Set ldap_connection_url
in customizations.yaml
.
yq write -i "${SITE_INIT}/customizations.yaml" \
'spec.kubernetes.sealed_secrets.keycloak_users_localize.generate.data.(args.name==ldap_connection_url).args.value' \
"ldaps://${LDAP}"
(pit#
) Review the keycloak_users_localize
sealed secret.
yq read "${SITE_INIT}/customizations.yaml" spec.kubernetes.sealed_secrets.keycloak_users_localize
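The ldap_connection_url value set above contains only the host name, which relies on 636 being the default LDAPS port. If the server listens on a different port, the port presumably has to be part of the URL. The following sketch builds the value under that assumption; the helper name ldaps_url is hypothetical and this procedure does not itself require it.

```shell
# Hypothetical helper: build an ldaps:// URL, appending the port only when it
# differs from 636, the default LDAPS port.
ldaps_url() {
    if [ "${2:-636}" = "636" ]; then
        printf 'ldaps://%s\n' "$1"
    else
        printf 'ldaps://%s:%s\n' "$1" "$2"
    fi
}

# Example:
# ldaps_url "${LDAP}" "${PORT}"
```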
Configure the ldapSearchBase
and localRoleAssignments
settings for
the cray-keycloak-users-localize
chart in customizations.yaml
.
NOTE
There may be one or more groups in LDAP for admins and one or more for users. Each admin group needs to be assigned to role admin
and set to both shasta
and cray
clients in Keycloak. Each user group needs to be assigned to role user
and set to both shasta
and cray
clients in Keycloak.
(pit#
) Set ldapSearchBase
in customizations.yaml
.
NOTE
This example sets ldapSearchBase
to dc=dcldap,dc=dit
yq write -i "${SITE_INIT}/customizations.yaml" spec.kubernetes.services.cray-keycloak-users-localize.ldapSearchBase 'dc=dcldap,dc=dit'
(pit#
) Set localRoleAssignments
in customizations.yaml
.
NOTE
This example sets localRoleAssignments
for the LDAP groups employee
, craydev
, and shasta_admins
to be the admin
role, and the LDAP group shasta_users
to be the user
role.
yq write -s - -i "${SITE_INIT}/customizations.yaml" <<EOF
- command: update
  path: spec.kubernetes.services.cray-keycloak-users-localize.localRoleAssignments
  value:
  - {"group": "employee", "role": "admin", "client": "shasta"}
  - {"group": "employee", "role": "admin", "client": "cray"}
  - {"group": "craydev", "role": "admin", "client": "shasta"}
  - {"group": "craydev", "role": "admin", "client": "cray"}
  - {"group": "shasta_admins", "role": "admin", "client": "shasta"}
  - {"group": "shasta_admins", "role": "admin", "client": "cray"}
  - {"group": "shasta_users", "role": "user", "client": "shasta"}
  - {"group": "shasta_users", "role": "user", "client": "cray"}
EOF
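Every group appears twice in the list above, once per Keycloak client, which is easy to get wrong when the list is typed by hand. As an illustration only, the entries can be generated from a role and a list of groups; the helper name role_assignments is hypothetical.

```shell
# Hypothetical helper: for one role, print a localRoleAssignments entry for
# every listed LDAP group on both Keycloak clients (shasta and cray).
# Usage: role_assignments ROLE GROUP...
role_assignments() (
    role="$1"
    shift
    for group in "$@"; do
        for client in shasta cray; do
            printf '  - {"group": "%s", "role": "%s", "client": "%s"}\n' \
                "${group}" "${role}" "${client}"
        done
    done
)

# Example: reproduce the entries used in the command above.
# role_assignments admin employee craydev shasta_admins
# role_assignments user shasta_users
```

The printed lines are indented so that they can be placed under the value: key of the update script.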
(pit#
) Review the cray-keycloak-users-localize
values.
yq read "${SITE_INIT}/customizations.yaml" spec.kubernetes.services.cray-keycloak-users-localize
(pit#
) Configure the Unbound DNS resolver (if needed).
Important: If access to a site DNS server is required and this DNS server was specified to
csi
using the site-dns
option (either on the command line or in the system_config.yaml
file), then no further action is required and this step should be skipped.
The default configuration is as follows:
cray-dns-unbound:
  domain_name: '{{ network.dns.external }}'
  forwardZones:
    - name: "."
      forwardIps:
        - "{{ network.netstaticips.system_to_site_lookups }}"
The configured site DNS server can be verified by inspecting the value set for system_to_site_lookups
.
yq r "${SITE_INIT}/customizations.yaml" spec.network.netstaticips.system_to_site_lookups
Possible output:
172.30.84.40
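When this check is scripted, the value read above can be tested for a plausible IPv4 form before deciding whether the forwarder needs to be changed. This is a sketch under the assumption that the site DNS server is given as an IPv4 address, as in the example output; the helper name is_ipv4 is hypothetical.

```shell
# Hypothetical helper: succeed if the argument is a dotted-quad IPv4 address.
is_ipv4() (
    case "$1" in
        ''|*[!0-9.]*|.*|*.|*..*) return 1 ;;   # empty, bad character, or empty field
    esac
    IFS=.
    # Split on dots; safe unquoted because only digits and dots remain.
    set -- $1
    [ "$#" -eq 4 ] || return 1
    for octet in "$@"; do
        [ "${#octet}" -le 3 ] && [ "${octet}" -le 255 ] || return 1
    done
)

# Example:
# is_ipv4 "$(yq r "${SITE_INIT}/customizations.yaml" spec.network.netstaticips.system_to_site_lookups)" ||
#     echo "no usable site DNS server is configured" >&2
```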
If there is no requirement to resolve external hostnames (including other services on the site network) or no upstream DNS server,
then the cray-dns-unbound
service should be configured to forward to the cray-dns-powerdns
service.
(pit#
) Update the forwardZones
configuration for the cray-dns-unbound
service to point to the cray-dns-powerdns
service.
yq write -s - -i "${SITE_INIT}/customizations.yaml" <<EOF
- command: update
  path: spec.kubernetes.services.cray-dns-unbound.forwardZones
  value:
  - name: "."
    forwardIps:
    - "10.92.100.85"
EOF
(pit#
) Review the cray-dns-unbound
values.
IMPORTANT
Do not remove the domain_name
entry; it is required for Unbound to forward requests to PowerDNS correctly.
yq read "${SITE_INIT}/customizations.yaml" spec.kubernetes.services.cray-dns-unbound
Expected output:
domain_name: '{{ network.dns.external }}'
forwardZones:
- name: "."
  forwardIps:
  - "10.92.100.85"
See the following documentation regarding known issues when operating with no upstream DNS server.
(Optional) Configure PowerDNS zone transfer and DNSSEC. See the PowerDNS Configuration Guide for more information.
If zone transfer is to be configured, then review customizations.yaml
and ensure that the primary_server
, secondary_servers
, and notify_zones
values are set correctly.
If DNSSEC is to be used, then add the desired keys into the dnssec
SealedSecret.
Thanos
S3 bucket
By default, there is NO retention policy set for thanos
object storage data.
This means that data is retained forever. Retention can be configured by using the --retention.resolution-raw
, --retention.resolution-5m
and --retention.resolution-1h
flags.
Not setting these flags, or setting them to 0 seconds, means no retention policy is configured.
The thanos
object storage is deployed by the cray-sysmgmt-health
chart to the sysmgmt-health
namespace.
To set the storage limits for the thanos
S3
bucket, configure the thanosCompactor
settings for the cray-sysmgmt-health
chart in customizations.yaml
.
(pit#
) Set thanosCompactor
in customizations.yaml
.
yq write -s - -i "${SITE_INIT}/customizations.yaml" <<EOF
- command: update
  path: spec.kubernetes.services.cray-sysmgmt-health.thanosCompactor
  value:
    resolutionraw: 15d
    resolution5m: 15d
    resolution1h: 15d
EOF
(pit#
) Review the thanosCompactor
values.
yq read "${SITE_INIT}/customizations.yaml" spec.kubernetes.services.cray-sysmgmt-health.thanosCompactor
Example output is:
resolutionraw: 15d
resolution5m: 15d
resolution1h: 15d
NOTE The recommended storage limit to configure for
thanos
is 15d - 30d
for all resolutions. As a rule of thumb, retention for each downsampling level should be the same and should be greater than the maximum date range (10 days for 5m to 1h
downsampling).
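The rule of thumb above can be expressed as a small check: all three retention values equal, and each longer than 10 days. The sketch below is illustrative only. The helper names retention_days and check_thanos_retention are hypothetical, and only the d (days) suffix used on this page is handled.

```shell
# Hypothetical helper: convert a retention value such as "15d" to whole days.
retention_days() {
    case "$1" in
        ''|d|*[!0-9]*d) return 1 ;;          # empty, bare suffix, or non-numeric
        *d) printf '%s\n' "${1%d}" ;;        # strip the trailing "d"
        *)  return 1 ;;                      # missing or unsupported suffix
    esac
}

# Succeed if the raw, 5m, and 1h retentions are equal and longer than MIN_DAYS
# (default 10, the maximum date range quoted above for 5m to 1h downsampling).
# Usage: check_thanos_retention RAW FIVE_MIN ONE_HOUR [MIN_DAYS]
check_thanos_retention() (
    min_days="${4:-10}"
    raw="$(retention_days "$1")" || return 1
    five="$(retention_days "$2")" || return 1
    hour="$(retention_days "$3")" || return 1
    [ "${raw}" -eq "${five}" ] && [ "${five}" -eq "${hour}" ] && [ "${raw}" -gt "${min_days}" ]
)

# Example:
# check_thanos_retention 15d 15d 15d && echo "retention settings follow the rule of thumb"
```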
The Prometheus SNMP exporter needs to be configured with a list of management network switches to scrape metrics from in order to populate the System Health Service Grafana dashboards.
NOTE For the Prometheus SNMP exporter to work, SNMP needs to be configured on the management switches, and the username and password need to match both Vault and the matching sealed secret in customizations.yaml. See the Prometheus SNMP Exporter page for more information and review the Adding SNMP Credentials to the System section for links to the relevant procedures.
(pit#
) Load the zeromq
container image required by Sealed Secret Generators.
NOTE
Requires a properly configured Docker or Podman environment.
"${CSM_PATH}/hack/load-container-image.sh" artifactory.algol60.net/csm-docker/stable/docker.io/zeromq/zeromq:v4.0.5
(pit#
) Re-encrypt existing secrets.
"${SITE_INIT}/utils/secrets-reencrypt.sh" \
"${SITE_INIT}/customizations.yaml" \
"${SITE_INIT}/certs/sealed_secrets.key" \
"${SITE_INIT}/certs/sealed_secrets.crt"
It is not an error if this script gives no output.
(pit#
) Generate secrets.
"${SITE_INIT}/utils/secrets-seed-customizations.sh" "${SITE_INIT}/customizations.yaml"
Leave the site-init
directory.
cd "${PITDATA}"
site-init
is now prepared. Resume Initialize the LiveCD.
Customer-specific customizations are any changes on top of the baseline configuration to satisfy customer-specific requirements. It is recommended that customer-specific customizations be tracked on branches separate from the mainline in order to make them easier to manage.
Apply any customer-specific customizations by merging the corresponding
branches into the master
branch of site-init
.
When considering merges, and especially when resolving conflicts, carefully examine differences to ensure all changes are relevant. For example, when applying a customer-specific customization used in a prior version, be sure the change still makes sense. It is common for options to change as new features are introduced and bugs are fixed.