Configuration of UAN nodes is performed by the Configuration Framework Service (CFS). CFS can apply configuration to both images and nodes. When the configuration is applied to nodes, the nodes must be booted and accessible through SSH over the Node Management Network (NMN).
The preferred method of creating CFS configurations is to use the Shasta Admin Toolkit (SAT) sat bootprep
command. This command automates the creation of IMS images, CFS configurations, and BOS session templates. See Create UAN Boot Images for more details.
CFS uses configuration layers. Configuration layers allow the sharing of Ansible roles provided by other products, and by the site. Non-root user access may be blocked during node configuration by enabling the uan-set-nologin
and uan-unset-nologin
configuration layers shown in the example bootprep file below. The parameterized fields are defined in a products_vars.yml
file.
IMPORTANT Do not remove or reorder the first three layers. The UAN product requires these layers and this specific order. Also, keep the required cos-application-last
layer, is as the last or second to last layer in the configuration if uan-set-nologin
and uan-unset-nologin
are active.
- name: "{{default.note}}uan-{{recipe.version}}{{default.suffix}}"
layers:
# The first three layers are required and must not be reordered.
- name: shs-{{default.network_type}}_install-{{slingshot_host_software.working_branch}}
playbook: shs_{{default.network_type}}_install.yml
product:
name: slingshot-host-software
version: "{{slingshot_host_software.version}}"
branch: "{{slingshot_host_software.working_branch}}"
- name: cos-application-{{cos.working_branch}}
playbook: cos-application.yml
product:
name: cos
version: "{{cos.version}}"
branch: "{{cos.working_branch}}"
- name: csm-packages-{{csm.version}}
playbook: csm_packages.yml
product:
name: csm
version: "{{csm.version}}"
# Optional layer to prevent non-root logins during configuration
# - name: uan-set-nologin-{{uan.working_branch}}
# playbook: set_nologin.yml
# product:
# name: uan
# version: "{{uan.version}}"
# branch: "{{uan.working_branch}}"
- name: uan-{{uan.working_branch}}
playbook: site.yml
product:
name: uan
version: "{{uan.version}}"
branch: "{{uan.working_branch}}"
### additional CFS layers here...
# cos-application-last is required to be the last or second to last layer
# when uan-set-nologin and uan-unset-nologin layers are active.
- name: cos-application-last-{{cos.working_branch}}
playbook: cos-application-after.yml
product:
name: cos
version: "{{cos.version}}"
branch: "{{cos.working_branch}}"
# Optional layer to allow non-root logins after configuration
# - name: uan-unset-nologin-{{uan.working_branch}}
# playbook: unset_nologin.yml
# product:
# name: uan
# version: "{{uan.version}}"
# branch: "{{uan.working_branch}}"
The first CFS Layer installs the Slingshot Host Software for the Slingshot network type of the system. The default.network_type
is:
mellanox
for ConnectX-5 NICs used in Slingshot 10cassini
for Slingshot NICs in Slingshot 11The name of the playbook must match the name of the HSN NICs (mellanox or cassini) in the UAN nodes. Additionally, the HSN NICs must be of the same type as the NCN and Compute nodes.
The second CFS Layer runs the following roles from the cos-config-management
VCS repository. Any configuration changes needed for these roles must be made in the cos-config-management
group_vars
or host_vars
subdirectories of that repository.
The following Ansible roles are run during UAN image configuration:
Standard UNIX configuration
cos-config-map
rsyslog
localtime
ntp
limits
kdump
Allow trust of CSM generated keys for elective passwordless SSH during image customization
trust-csm-ssh-keys
HPE Cray EX system configurations
overlay-preload
Install COS rpms
cos-services-install
**DVS/LNET/FS
cray_lnet_install
cray_dvs_install
cray_lnet_load
cray_dvs_load
The following Ansible roles are run during UAN post-boot configuration:
Standard UNIX configuration
rsyslog
HPE Cray EX system configurations
ca-cert
overlay-preload
GPU deploy support
cray_gpu_deploy
The third CFS Layer installs the the Cray Management System packages. These are normally not modified.
This optional layer is recommended when the nodes will host non-root users. The layer touches the /etc/nologin
file preventing non-root users from logging into the node while it is being configured. Be sure to include the optional UAN layer which calls the unset_nologin.yml
playbook as indicated in this document if non-root users are to be allowed to login after the node is configured.
The Ansible roles involved in the UAN layer of the configuration are listed in the site.yml file in the uan-config-management git repository in VCS. Most of the roles that are specific to image configuration are required for the operation as a UAN and must not be removed from site.yml.
The UAN-specific roles involved in post-boot UAN node configuration are:
uan_disk_config
: this role configures the last disk found on the UAN that is smaller than 1TB, by default. That disk will be formatted with a scratch and swap partition mounted at /scratch and /swap, respectively. Each partition is 50% of the disk.
uan_packages
: this role installs any RPM packages listed in the uan-config-management repo.
uan_interfaces
: this role configures the UAN node networking. By default, this role does not configure a default route or the Customer Access Network (CAN or CHN) connection for the HPE Cray EX supercomputer. If CAN or CHN is enabled, the default route will be on the CAN or CHN. Otherwise, a default route must be set up in the customer interfaces definitions. Without the CAN or CHN, there will not be an external connection to the customer site network unless one is defined in the customer interfaces. See Configure Interfaces on UANs.
NOTE: If a UAN layer is used in the Compute node CFS configuration, the uan_interfaces
role will configure the default route on Compute nodes to be on the HSN, if the BICAN System Default Route is set to CHN
.
uan_motd
: this role Provides a default message of the day that can be customized by the administrator.
uan_ldap
: this optional role configures the connection to LDAP servers. To disable this role, the administrator must set ‘uan_ldap_setup:no’ in the ‘uan-config-management’ VCS repository.
uan_hardening
: This role configures site/customer-defined network security of UANs, for example, preventing ssh out of the UAN over the NMN to NCN nodes.
The UAN roles in site.yml
are required and must not be removed, with exception of uan_ldap
if the site is using some other method of user authentication. The uan_ldap
may also be skipped by setting the value of uan_ldap_setup
to no
in a group_vars
or host_vars
configuration file. Configuration of this layer is made in the uan-config-management
VCS repository.
This CFS Layer runs the following roles from the cos-config-management
VCS repository. Any configuration changes needed for these roles must be made in the group_vars
or host_vars
subdirectories of that repository.
The following Ansible roles are run during UAN image configuration:
rebuilt-initrd
The following Ansible roles are run during UAN post-boot configuration:
configure_fs
This UAN layer deletes the /etc/nologin
file allowing non-root users to log into the UAN. If the optional UAN layer that runs set_nologin.yml
was used, this layer must be used or only the root user will have acces to the node.