The 2.4.13 version of the SAT product includes:
sat Python package and CLI
sat-podman wrapper script
sat-install-utility container image
cfs-config-util container image
Because of installation refactoring efforts, the following two components are no longer delivered with SAT:
sat-cfs-install container image
sat-cfs-install Helm chart
A version of the cray-sat container image is now included in CSM. For more information, see SAT in CSM.
The SAT install.sh script no longer uses a sat-cfs-install Helm chart and container image to upload its Ansible content to the sat-config-management repository in VCS. Instead, it uses Podman to run the cf-gitea-import container directly. Some of the benefits of this change include the following:
cray-sat Container Image and cray-sat-podman Package
In older SAT releases, the sat wrapper script that was provided by the cray-sat-podman package installed on Kubernetes master NCNs included a hard-coded version of the cray-sat container image. As a result, every new version of the cray-sat image required a corresponding new version of the cray-sat-podman package.
In this release, this tight coupling of the cray-sat-podman package and the cray-sat container image was removed. The sat wrapper script provided by the cray-sat-podman package now looks for the version of the cray-sat container image in the /opt/cray/etc/sat/version file. This file is populated with the correct version of the cray-sat container image by the SAT layer of the CFS configuration that is applied to management NCNs. If the version file does not exist, the wrapper script defaults to the version of the cray-sat container image delivered with the latest version of CSM installed on the system.
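The lookup order described above can be illustrated with a minimal Python sketch. The real wrapper is a script shipped in the cray-sat-podman package and its implementation is not shown in these notes; the function name, the default-version argument, and the handling of an empty file are assumptions made for illustration only.

```python
from pathlib import Path

# Path named in the release notes.
VERSION_FILE = Path("/opt/cray/etc/sat/version")


def resolve_image_version(default_version: str,
                          version_file: Path = VERSION_FILE) -> str:
    """Return the cray-sat image version from the version file, else the default.

    default_version stands in for the version delivered with the latest
    installed CSM release (hypothetical parameter for this sketch).
    """
    try:
        version = version_file.read_text().strip()
    except OSError:
        # Covers a missing or unreadable version file.
        return default_version
    # Treating an empty file like a missing one is an assumption of this sketch.
    return version or default_version
```

Calling resolve_image_version with a fallback value returns the contents of the version file when it exists and the fallback otherwise.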
The steps for performing NCN personalization as part of the SAT installation were moved out of the install.sh script and into a new update-mgmt-ncn-cfs-config.sh script that is provided in the SAT release distribution. The new script provides additional flexibility in how it modifies the NCN personalization CFS configuration for SAT. It can modify an existing CFS configuration by name, a CFS configuration being built in a JSON file, or an existing CFS configuration that applies to certain components. For more information, see Perform NCN Personalization.
sat bootprep Features
The following new features were added to the sat bootprep command:
Variable substitutions using Jinja2 templates in certain fields of the sat bootprep input file
For more information, see HPC CSM Software Recipe Variable Substitutions and Dynamic Variable Substitutions.
Schema version validation in the sat bootprep input files
For more information, see Providing a Schema Version.
Ability to look up images and recipes provided by products
For more information, see Defining IMS Images.
The schema of the sat bootprep input files was also changed to support these new features:
base key instead of under an ims key. The old ims key is deprecated.
base.image_ref. You should no longer use the IMS name of the image on which it depends.
image.ims.name, image.ims.id, or image.image_ref. Specifying a string value directly under the image key is deprecated.
For more information on defining IMS images and BOS session templates in the sat bootprep input file, see Defining IMS Images and Defining BOS Session Templates.
sat swap
The sat swap command was updated to support swapping compute and UAN blades with sat swap blade. This functionality is described in the following processes of the Cray System Management Documentation:
BOS v2
A new v2 version of the Boot Orchestration Service (BOS) is available in CSM 1.3.0. SAT has added support for BOS v2. This impacts the following commands that interact with BOS:
sat bootprep
sat bootsys
sat status
By default, SAT uses BOS v1. However, you can choose the BOS version you want to use. For more information, see Change the BOS Version.
sat status
When using BOS v2, sat status outputs additional fields. These fields show the most recent BOS session, session template, booted image, and boot status for each node. An additional --bos-fields option was added to limit the output of sat status to these fields. The fields are not displayed when using BOS v1.
This is the first release of SAT built from open source code repositories. As a result, build infrastructure was changed to use an external Jenkins instance, and artifacts are now published to an external Artifactory instance. These changes should not impact the functionality of the SAT product in any way.
The paramiko Python package version was updated from 2.9.2 to 2.10.1 to mitigate CVE-2022-24302.
The oauthlib Python package version was updated from 3.2.0 to 3.2.1 to mitigate CVE-2022-36087.
SAT stores information used to authenticate to the API gateway with Keycloak. Token files are stored in the ~/.config/sat/tokens/ directory. Those files have always had permissions appropriately set to restrict them to be readable only by the user.
Keycloak usernames used to authenticate to the API gateway are stored in the SAT config file at ~/.config/sat/sat.toml. Keycloak usernames are also used in the file names of tokens stored in ~/.config/sat/tokens. As an additional security measure, SAT now restricts the permissions of the SAT config file to be readable and writable only by the user. It also restricts the tokens directory and the entire SAT config directory ~/.config/sat to be accessible only by the user. This prevents other users on the system from viewing Keycloak usernames used to authenticate to the API gateway.
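As a rough illustration of the permission scheme described above, the following Python sketch applies owner-only modes to a config directory, its tokens subdirectory, and the sat.toml file. It is not SAT's actual code; the function name is hypothetical, and the specific modes (0700 for directories, 0600 for the file) are an interpretation of "accessible only by the user" and "readable and writable only by the user".

```python
import os
import stat
from pathlib import Path


def restrict_sat_config(config_dir: Path) -> None:
    """Make the config directory, tokens directory, and sat.toml owner-only."""
    tokens_dir = config_dir / "tokens"
    config_file = config_dir / "sat.toml"
    for directory in (config_dir, tokens_dir):
        directory.mkdir(parents=True, exist_ok=True)
        # rwx for the owner only (0o700).
        os.chmod(directory, stat.S_IRWXU)
    if config_file.exists():
        # rw for the owner only (0o600).
        os.chmod(config_file, stat.S_IRUSR | stat.S_IWUSR)
```

For the default location, this would be called as restrict_sat_config(Path.home() / ".config" / "sat").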
sat init did not print a message confirming a new configuration file was created.
sat showrev exited with a traceback if the file /opt/cray/etc/site_info.yaml existed but was empty. This could occur if the user exited sat setrev with Ctrl-C.
sat bootsys man page, and added a description of the command stages.
The 2.3.4 version of the SAT product includes:
sat Python package and CLI
sat-podman wrapper script
sat-cfs-install container image
sat-cfs-install Helm chart
sat-install-utility container image
cfs-config-util container image
sat Commands
None.
When running sat commands, the current working directory is now mounted in the container as /sat/share, and the current working directory within the container is also /sat/share.
Files in the current working directory must be specified using relative paths to that directory, because the current working directory is always mounted on /sat/share.
Absolute paths should be avoided, and paths that are outside of $HOME or $PWD are never accessible to the container environment.
The home directory is still mounted on the same path inside the container as it is on the host.
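The mount rules above can be made concrete with a small Python sketch that maps a host path to the path at which the same file would be visible inside the container. This is only an illustration of the rules as stated in these notes, not code from the wrapper script; the function name and parameters are hypothetical, and relative paths containing ".." are not handled.

```python
from pathlib import Path
from typing import Optional

# Mount point for the current working directory, as stated in the notes.
CONTAINER_CWD = Path("/sat/share")


def container_path(host_path: str, cwd: Path, home: Path) -> Optional[Path]:
    """Return where host_path is visible in the container, or None if it is not mounted."""
    path = Path(host_path)
    if not path.is_absolute():
        # Relative paths resolve against the container's working directory.
        return CONTAINER_CWD / path
    # The home directory keeps its host path; the working directory moves to /sat/share.
    for host_root, container_root in ((home, home), (cwd, CONTAINER_CWD)):
        try:
            return container_root / path.relative_to(host_root)
        except ValueError:
            continue
    # Anything outside $HOME and $PWD is not accessible.
    return None
```

An absolute path under the working directory maps to a different path in the container, which is why relative paths are the safer choice.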
sat bootsys
The following options were added to sat bootsys:
--bos-limit
--recursive
The --bos-limit option passes a given limit string to a BOS session. The --recursive option allows a slot or other higher-level component to be specified in the limit string.
sat bootprep
The --delete-ims-jobs option was added to sat bootprep run. It deletes IMS jobs after sat bootprep is run. Jobs are no longer deleted by default.
sat status
sat status now includes information about nodes’ CFS configuration statuses, such as desired configuration, configuration status, and error count.
The output of sat status now splits different component types into different report tables.
The following options were added to sat status:
--hsm-fields, --sls-fields, --cfs-fields
--bos-template
The --hsm-fields, --sls-fields, and --cfs-fields options limit the output columns according to specified CSM services.
The --bos-template option filters the status report according to the specified session template’s boot sets.
The following components were modified to be compatible with CSM 1.2.
sat-cfs-install container image and Helm chart
sat-install-utility container image
The sat-ncn Ansible role provided by sat-cfs-install was modified to enable GPG checks on packages while leaving GPG checks disabled on repository metadata.
Updated urllib3 dependency to version 1.26.5 to mitigate CVE-2021-33503 and refreshed Python dependency versions.
Minor bug fixes were made in each of the repositories. For full change lists, refer to each repository’s CHANGELOG.md file.
The known issues listed under the SAT 2.2 release were fixed.
SAT 2.2.16 was released on February 25th, 2022.
This version of the SAT product included:
sat Python package and CLI
sat-podman wrapper script
sat-cfs-install container image and Helm chart
It also added the following new components:
sat-install-utility container image
cfs-config-util container image
The following sections detail the changes in this release.
sat Command Unavailable in sat bash Shell
After launching a shell within the SAT container with sat bash, the sat command will not be found. For example:
(CONTAINER-ID) sat-container:~ # sat status
bash: sat: command not found
This can be resolved temporarily in one of two ways. /sat/venv/bin/ may be prepended to the $PATH environment variable:
(CONTAINER-ID) sat-container:~ # export PATH=/sat/venv/bin:$PATH
(CONTAINER-ID) sat-container:~ # sat status
Or, the file /sat/venv/bin/activate may be sourced:
(CONTAINER-ID) sat-container:~ # source /sat/venv/bin/activate
(CONTAINER-ID) sat-container:~ # sat status
sat bash Shell
After launching a shell within the SAT container with sat bash, tab completion for sat commands does not work.
This can be resolved temporarily by sourcing the file /etc/bash_completion.d/sat-completion.bash:
source /etc/bash_completion.d/sat-completion.bash
sat in Root Directory
sat commands will not work if the current directory is /. For example:
ncn-m001:/ # sat --help
Error: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: open /dev/console: operation not permitted: OCI runtime permission denied error
To resolve, run sat in another directory.
sat in Config Directory
sat commands will not work if the current directory is ~/.config/sat. For example:
ncn-m001:~/.config/sat # sat --help
Error: /root/.config/sat: duplicate mount destination
To resolve, run sat in another directory.
sat Commands
sat bootprep automates the creation of CFS configurations, the build and customization of IMS images, and the creation of BOS session templates. For more information, see SAT Bootprep.
sat slscheck performs a check for consistency between the System Layout Service (SLS) and the Hardware State Manager (HSM).
sat bmccreds provides a simple interface for interacting with the System Configuration Service (SCSD) to set BMC Redfish credentials.
sat hwhist displays hardware component history by XName (location) or by its Field-Replaceable Unit ID (FRUID). This command queries the Hardware State Manager (HSM) API to obtain this information. Since the sat hwhist command supports querying for the history of a component by its FRUID, the FRUID of components has been added to the output of sat hwinv.
The following automation has been added to the install script, install.sh:
sat-config-import Kubernetes job, which is started when the sat-cfs-install Helm chart is deployed.
ncn-personalization).
The SAT product uploads additional information to the cray-product-catalog Kubernetes ConfigMap detailing the components it provides, including container (Docker) images, Helm charts, RPMs, and package repositories.
This information is used to support uninstall and activation of SAT product versions moving forward.
Beginning with the 2.2 release, SAT now provides partial support for the uninstall and activation of the SAT product stream.
For more information, see Uninstall: Removing a Version of SAT and Activate: Switching Between Versions.
sat status
A Subrole column has been added to the output of sat status. This allows you to easily differentiate between master, worker, and storage nodes in the management role, for example.
Hostname information from SLS has been added to sat status output.
Support for JSON-formatted output has been added to commands which currently support the --format option, such as hwinv, status, and showrev.
Many usability improvements have been made to multiple sat commands, mostly related to filtering command output. The following are some highlights:
--fields option to display only specific fields for subcommands which display tabular reports.
--filter queries so that the first match is used, similar to --sort-by.
--filter, --fields, and --reverse for summaries displayed by sat hwinv.
sat hwinv.
The default log level for stderr has been changed from “WARNING” to “INFO”. For more information, see SAT Logging.
With the command-line options --loglevel-stderr and --loglevel-file, the log level can now be configured separately for stderr and the log file.
The existing --loglevel option is now an alias for the --loglevel-stderr option.
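One way such an alias can be expressed is by registering both option strings on the same argparse argument. The sketch below only illustrates that technique and is not taken from the sat source; the parser name, the list of accepted level names, and the lack of a default for the file log level are assumptions.

```python
import argparse

# Assumed set of level names; the notes only mention "WARNING" and "INFO".
LEVELS = ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser where --loglevel is an alias for --loglevel-stderr."""
    parser = argparse.ArgumentParser(prog="sat-sketch", allow_abbrev=False)
    # Both option strings store into the same destination, which makes
    # --loglevel behave exactly like --loglevel-stderr.
    parser.add_argument("--loglevel-stderr", "--loglevel",
                        dest="loglevel_stderr", type=str.upper,
                        choices=LEVELS, default="INFO")
    # The notes do not state a default for the file log level, so none is set.
    parser.add_argument("--loglevel-file", dest="loglevel_file",
                        type=str.upper, choices=LEVELS, default=None)
    return parser
```

With this parser, passing --loglevel debug sets the stderr level and leaves the file level untouched.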
The Podman wrapper script is the script installed at /usr/bin/sat on the master management NCNs by the cray-sat-podman RPM that runs the cray-sat container in podman. The following subsections detail improvements that were made to the wrapper script in this release.
cray-sat Container
The Podman wrapper script that launches the cray-sat container with podman has been modified to mount the user’s current directory and home directory into the cray-sat container to provide access to local files in the container.
The man page for the Podman wrapper script, which is accessed by typing man sat on a master management NCN, has been improved to document the following:
Fixed issues with redirecting stdout and stderr, and piping output to commands, such as awk, less, and more.
A new sat option has been added to configure the HTTP timeout length for requests to the API gateway. For more information, refer to sat-man sat.
sat bootsys Improvements
Many improvements and fixes have been made to sat bootsys. The following are some highlights:
--excluded-ncns option, which can be used to omit NCNs from the platform-services and ncn-power stages in case they are inaccessible.
sat bootsys shutdown now prompt the user to continue before proceeding. A new option, --disruptive, will bypass this.
platform-services stage of sat bootsys boot.
sat xname2nid Improvements
sat xname2nid can now recursively expand slot, chassis, and cabinet XNames to a list of NIDs in those locations.
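The idea of recursive expansion can be sketched in Python as a prefix match over node XNames. The real command obtains node data from HSM; here a hypothetical dictionary of node XNames to NIDs stands in for that lookup, and the function names are illustrative only.

```python
from typing import Dict, List


def is_within(xname: str, ancestor: str) -> bool:
    """Return True if xname is the ancestor itself or located beneath it."""
    if xname == ancestor:
        return True
    # The character after the prefix must be a letter, so that slot
    # x1000c0s1 does not also match nodes in slot x1000c0s10.
    return xname.startswith(ancestor) and xname[len(ancestor)].isalpha()


def expand_to_nids(xname: str, node_nids: Dict[str, int]) -> List[int]:
    """Expand a node, slot, chassis, or cabinet XName to a sorted list of NIDs."""
    return sorted(nid for node, nid in node_nids.items() if is_within(node, xname))
```

Given a mapping that contains nodes in slots 1 and 10 of the same chassis, expanding the slot 1 XName returns only the NIDs in slot 1, while expanding the chassis XName returns both sets.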
A new --format option has been added to sat xname2nid. It sets the output format to either “range” (the default) or “NID”. The “range” format displays NIDs in a compressed range format suitable for use with a workload manager like Slurm.
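A compressed range of this kind can be produced by collapsing runs of consecutive NIDs. The Python sketch below shows one way to do that; the "nid" prefix, the six-digit zero padding, and the bracketed layout follow a common Slurm hostlist style and are assumptions, since these notes do not show the exact output of sat xname2nid.

```python
from typing import Iterable, List, Tuple


def compress_nids(nids: Iterable[int], prefix: str = "nid", width: int = 6) -> str:
    """Collapse NIDs into a bracketed range expression such as nid[000001-000003]."""
    ordered = sorted(set(nids))
    if not ordered:
        return ""
    runs: List[Tuple[int, int]] = []
    start = prev = ordered[0]
    for nid in ordered[1:]:
        if nid != prev + 1:
            # A gap ends the current run of consecutive NIDs.
            runs.append((start, prev))
            start = nid
        prev = nid
    runs.append((start, prev))
    parts = [
        f"{first:0{width}d}" if first == last
        else f"{first:0{width}d}-{last:0{width}d}"
        for first, last in runs
    ]
    return f"{prefix}[{','.join(parts)}]"
```

Duplicate and unsorted input is tolerated because the NIDs are de-duplicated and sorted before runs are detected.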
v2 HSM API
The commands which interact with HSM (for example, sat status and sat hwinv) now use the v2 HSM API.
sat diag Limited to HSN Switches
sat diag will now only operate against HSN switches by default. These are the only controllers that support running diagnostics with HMJTD.
sat showrev Enhancements
A column has been added to the output of sat showrev that indicates whether a product version is “active”. The definition of “active” varies across products, and not all products may set an “active” version.
For SAT, the active version is the one with its hosted-type package repository in Nexus set as the member of the group-type package repository in Nexus, meaning that it will be used when installing the cray-sat-podman RPM.
cray-sat Container Image Size Reduction
The size of the cray-sat container image has been approximately cut in half by leveraging multi-stage builds. This also improved the repeatability of the unit tests by running them in the container.
Minor bug fixes were made in cray-sat and in cray-sat-podman. For full change lists, refer to each repository’s CHANGELOG.md file.
We released version 2.1.16 of the SAT product in Shasta v1.5.
This version of the SAT product included:
sat Python package and CLI
sat-podman wrapper script
It also added the following new component:
sat-cfs-install Docker image and Helm chart
The following sections detail the changes in this release.
This release further decouples the installation of the SAT product from the CSM product. The cray-sat-podman RPM is no longer installed in the management non-compute node (NCN) image. Instead, the cray-sat-podman RPM is installed on all master management NCNs via an Ansible playbook which is referenced by a layer of the CFS configuration that applies to management NCNs. This CFS configuration is typically named ncn-personalization.
The SAT product now includes a Docker image and a Helm chart named sat-cfs-install. The SAT install script, install.sh, deploys the Helm chart with Loftsman. This Helm chart deploys a Kubernetes job that imports the SAT Ansible content to a Git repository in VCS (Gitea) named sat-config-management. This repository is referenced by the layer added to the NCN personalization CFS configuration.
All commands which used to access Redfish directly have either been removed or modified to use higher-level service APIs. This includes the following commands:
sat sensors
sat diag
sat linkhealth
The sat sensors command has been rewritten to use the SMA telemetry API to obtain the latest sensor values. The command’s usage has changed slightly, but legacy options work as before, so it is backwards compatible. Additionally, new commands have been added.
The sat diag command has been rewritten to use a new service called Fox, which is delivered with the CSM-Diags product. The sat diag command now launches diagnostics using the Fox service, which launches the corresponding diagnostic programs on controllers using the Hardware Management Job and Task Daemon (HMJTD) over Redfish. Essentially, Fox serves as a proxy for us to start diagnostics over Redfish.
The sat linkhealth command has been removed. Its functionality has been replaced by functionality from the Slingshot Topology Tool (STT) in the fabric manager pod.
The Redfish username and password command line options and config file options have been removed. For more information, see Remove Obsolete Configuration File Sections.
sat setrev and sat showrev
sat setrev now collects the following information from the admin, which is then displayed by sat showrev:
Additional guidance and validation has been added to each field collected by sat setrev. This sets the stage for sdu setup to stop collecting this information and instead collect it from sat showrev or its S3 bucket.
sat bootsys
The platform-services stage of the sat bootsys boot command has been improved to start inactive Ceph services, unfreeze Ceph, and wait for Ceph health in the correct order. The ceph-check stage has been removed as it is no longer needed.
The platform-services stage of sat bootsys boot now prompts for confirmation of the storage NCN hostnames in addition to the Kubernetes masters and workers.
sat firmware.
cray-sat container image.
sat firmware command.
We released version 2.0.4 of the SAT product in Shasta v1.4.1.
This version of the SAT product included:
sat Python package and CLI
sat-podman wrapper script
The following sections detail the changes in this release.
Two new commands were added to translate between NIDs and XNames:
sat nid2xname
sat xname2nid
These commands perform this translation by making requests to the Hardware State Manager (HSM) API.
sat swap where creating the offline port policy failed.
sat bootsys shutdown --stage bos-operations to no longer forcefully power off all compute nodes and application nodes using CAPMC when BOS sessions complete or time out.
sat bootsys boot --stage cabinet-power.
In Shasta v1.4, SAT became an independent product, which meant we began to designate a version number for the entire SAT product. We released version 2.0.3 of the SAT product in Shasta v1.4.
This version of the SAT product included the following components:
sat Python package and CLI
It also added the following new component:
sat-podman wrapper script
The following sections detail the changes in this release.
SAT is now packaged and released as an independent product. The product deliverable is called a “release distribution”. The release distribution is a gzipped tar file containing an install script. This install script loads the cray/cray-sat container image into the Docker registry in Nexus and loads the cray-sat-podman RPM into a package repository in Nexus.
In this release, the cray-sat-podman package is still installed in the master and worker NCN images provided by CSM. This is changed in SAT 2.1.16 released in Shasta v1.5.
The sat command now runs in a container under Podman. The sat executable is now installed on all nodes in the Kubernetes management cluster (workers and masters). This executable is a wrapper script that starts a SAT container in Podman and invokes the sat Python CLI within that container. The admin can run individual sat commands directly on the master or worker NCNs as before, or they can run sat commands inside the SAT container after using sat bash to enter an interactive shell inside the SAT container.
To view man pages for sat commands, the user can run sat-man SAT_COMMAND, replacing SAT_COMMAND with the name of the sat command. Alternatively, the user can enter the sat container with sat bash and use the man command.
sat init Command and Config File Location Change
The default location of the SAT config file has been changed from /etc/sat.toml to ~/.config/sat/sat.toml. A new command, sat init, has been added that initializes a configuration file in the new default directory. This better supports individual users on the system who want their own config files.
~/.config/sat is mounted into the container that runs under Podman, so changes are persistent across invocations of the sat container. If desired, an alternate configuration directory can be specified with the SAT_CONFIG_DIR environment variable.
Additionally, if a config file does not yet exist when a user runs a sat command, one is generated automatically.
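The resolution order described above (the SAT_CONFIG_DIR environment variable first, then ~/.config/sat) can be sketched in Python as follows. This is an illustration rather than SAT's own code; the function name is hypothetical, and the assumption that the file inside an alternate directory is also named sat.toml is not confirmed by these notes.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def sat_config_file(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the path of the SAT config file for the given environment."""
    env = os.environ if env is None else env
    config_dir = env.get("SAT_CONFIG_DIR")
    if config_dir:
        # Assumed: the alternate directory holds a file named sat.toml.
        return Path(config_dir) / "sat.toml"
    home = env.get("HOME") or str(Path.home())
    return Path(home) / ".config" / "sat" / "sat.toml"
```

Passing an explicit mapping instead of relying on os.environ keeps the function easy to exercise without changing the process environment.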
sat hwinv
Additional functionality has been added to sat hwinv including:
--list-node-enclosure-power-supplies option.
--list-node-accels option. The count of node accelerators is also included for each node.
--list-node-accel-risers option. The count of node accelerator risers is also included for each node.
--list-node-hsn-nics option. The count of HSN NICs is also included for each node.
Documentation for these new options has been added to the man page for sat hwinv.
sat setrev in S3
The sat setrev and sat showrev commands now use S3 to store and obtain site information, including system name, site name, serial number, install date, and system type. Since the information is stored in S3, it will now be consistent regardless of the node on which sat is executed.
As a result of this change, S3 credentials must be configured for SAT. For more information, see Generate SAT S3 Credentials.
sat showrev
sat showrev now shows product information from the cray-product-catalog ConfigMap in Kubernetes.
sat showrev
The output from sat showrev has also been changed in the following ways:
--docker and --packages options were considered misleading and have been removed.
--local option.
sat cablecheck
The sat cablecheck command has been removed. To verify that the system’s Slingshot network is cabled correctly, admins should now use the show cables command in the Slingshot Topology Tool (STT).
sat swap Command Compatibility with Next-gen Fabric Controller
The sat swap command was added in Shasta v1.3.2. This command used the Fabric Controller API. Shasta v1.4 introduced a new Fabric Manager API and removed the Fabric Controller API, so this command has been rewritten to use the new backwards-incompatible API. Usage of the command did not change.
sat bootsys Functionality
Much of the functionality added to sat bootsys in Shasta v1.3.2 was broken by changes introduced in Shasta v1.4, which removed the Ansible inventory and playbooks.
The functionality in the platform-services stage of sat bootsys has been re-implemented to use Python directly instead of Ansible. This resulted in a more robust procedure with better logging to the sat log file. Failures to stop containers on Kubernetes nodes are handled more gracefully, and more information about the containers that failed to stop, including how to debug the problem, is included.
Improvements were made to console logging setup for non-compute nodes (NCNs) when they are shut down and booted.
The following improvements were made to the bos-operations stage of sat bootsys:
--bos-templates, and a corresponding config-file option, bos_templates, were added, and the --cle-bos-template and --uan-bos-template options and their corresponding config file options were deprecated.
The following functionality has been removed from sat bootsys:
hsn-bringup stage of sat bootsys boot has been removed due to removal of the underlying Ansible playbook.
bgp-check stage of sat bootsys {boot,shutdown} has been removed. It is now a manual procedure.
The location of the sat log file has changed from /var/log/cray/sat.log to /var/log/cray/sat/sat.log. This change simplifies mounting this file into the sat container running under Podman.
Shasta v1.3.2 included version 2.4.0 of the sat Python package and CLI.
The following sections detail the changes in this release.
sat swap Command for Switch and Cable Replacement
The sat switch command, which supported operations for replacing a switch, has been deprecated and replaced with the sat swap command, which now supports replacing a switch or a cable.
The sat swap switch command is equivalent to sat switch. The sat switch command will be removed in a future release.
sat bootsys Command
The sat bootsys command now has multiple stages for both the boot and shutdown actions. Please refer to the “System Power On Procedures” and “System Power Off Procedures” sections of the Cray Shasta Administration Guide (S-8001) for more details on using this command in the context of a full system power off and power on.
Shasta v1.3 included version 2.2.3 of the sat Python package and CLI.
This version of the sat CLI contained the following commands:
auth
bootsys
cablecheck
diag
firmware
hwinv
hwmatch
k8s
linkhealth
sensors
setrev
showrev
status
swap
switch
For more information on each of these commands, see the System Admin Toolkit Command Overview and the table of commands in the SAT Authentication section of this document.