The following workflows present a high-level overview of common Boot Orchestration Service (BOS) operations. These workflows depict how services interact with each other when booting, configuring, or shutting down nodes. They also help provide a quicker and deeper understanding of how the system functions.
The following are mentioned in the workflows:
initrd, image root) and boot parameters.
The set of allowed operations for a session is:
boot – Boot nodes that are powered off
configure – Reconfigure the nodes using the Configuration Framework Service (CFS)
reboot – Gracefully power down nodes that are on and then power them back up
shutdown – Gracefully power down nodes that are on
The following workflows are included in this document:
Use case: Administrator powers on and configures select compute nodes.
Components: This workflow is based on the interaction of the BOS with other services during the boot process.
Workflow overview: The following sequence of steps occurs during this workflow.
Administrator creates a configuration.
Add a configuration to CFS.
ncn-mw# cray cfs configurations update sample-config --file configuration.json --format json
Example output:
{
"lastUpdated": "2020-09-22T19:56:32Z",
"layers": [
{
"cloneUrl": "https://api-gw-service-nmn.local/vcs/cray/configmanagement.git",
"commit": "01b8083dd89c394675f3a6955914f344b90581e2",
"playbook": "site.yaml"
}
],
"name": "sample-config"
}
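The contents of configuration.json are not shown above. As a rough sketch, assuming the input file carries a single top-level "layers" list with the same fields that appear in the example output, it could be generated as follows. The file layout and the helper name render_configuration are assumptions made for illustration; the layer values are copied from the example output.

```python
import json

# Layer values are copied from the example output above. The overall file
# layout (one top-level "layers" list) is an assumption, not confirmed here.
CONFIGURATION = {
    "layers": [
        {
            "cloneUrl": "https://api-gw-service-nmn.local/vcs/cray/configmanagement.git",
            "commit": "01b8083dd89c394675f3a6955914f344b90581e2",
            "playbook": "site.yaml",
        }
    ]
}


def render_configuration(configuration=CONFIGURATION):
    """Serialize the configuration dict to the JSON text that would go in the file."""
    return json.dumps(configuration, indent=2)


if __name__ == "__main__":
    # Write the file that the `cray cfs configurations update` command reads.
    with open("configuration.json", "w", encoding="utf-8") as handle:
        handle.write(render_configuration() + "\n")
```

Fields such as lastUpdated and name are left out of the sketch because, in the example output, they look like values reported back by CFS rather than values the administrator supplies.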
Administrator creates a session template.
A session template is a collection of data specifying a group of nodes, as well as the boot artifacts and configuration that should be applied to them. A session template can be created from a JSON structure. It returns a session template ID if successful.
See Manage a session template for more information.
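To make the description above more concrete, the sketch below assembles a session template as a plain dictionary: a named group of nodes, the boot artifacts for that group, and the configuration to apply. Every field name here (cfs, boot_sets, node_roles_groups, path, kernel_parameters), the template name, and the placeholder values are assumptions chosen for illustration; consult Manage a session template for the actual schema.

```python
import json


def build_session_template(name, cfs_configuration, boot_sets):
    """Assemble a session template dict (hypothetical field names throughout)."""
    if not boot_sets:
        raise ValueError("a session template needs at least one boot set")
    return {
        "name": name,
        # Configuration that CFS should apply to the nodes.
        "cfs": {"configuration": cfs_configuration},
        # Each boot set pairs a group of nodes with its boot artifacts.
        "boot_sets": boot_sets,
    }


template = build_session_template(
    name="sample-template",  # hypothetical template name
    cfs_configuration="sample-config",
    boot_sets={
        "compute": {
            "node_roles_groups": ["Compute"],  # which nodes to act on
            "path": "s3://boot-images/IMAGE_ID/manifest.json",  # placeholder
            "kernel_parameters": "console=ttyS0,115200",  # placeholder
        }
    },
)

print(json.dumps(template, indent=2))
```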
Administrator creates a session.
Create a session to perform the operation specified in the operation request parameter on the boot set defined in the session template.
For this use case, the administrator creates a session with operation as boot and specifies the session template ID.
ncn-mw# cray bos session create --template-uuid SESSIONTEMPLATE_NAME --operation boot
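The same command shape is used for every workflow in this document; only the operation changes. A small sketch of a wrapper that rejects an unknown operation before anything is sent might look like the following. The function name session_create_command is hypothetical, and the sketch only builds the argument list rather than invoking the CLI.

```python
import shlex

# The operations listed at the top of this document.
ALLOWED_OPERATIONS = ("boot", "configure", "reboot", "shutdown")


def session_create_command(template_id, operation):
    """Return the argument list for creating a session, validating the operation."""
    if operation not in ALLOWED_OPERATIONS:
        raise ValueError(
            f"unsupported operation {operation!r}; expected one of {ALLOWED_OPERATIONS}"
        )
    return [
        "cray", "bos", "session", "create",
        "--template-uuid", template_id,
        "--operation", operation,
    ]


# Render the command as a single shell-quoted string for display.
print(shlex.join(session_create_command("SESSIONTEMPLATE_NAME", "boot")))
```

Returning a list instead of a formatted string keeps the arguments safe to hand to subprocess.run without shell quoting concerns.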
Launch BOA.
The creation of a session results in the creation of a Kubernetes BOA job to complete the operation. BOA coordinates with other services to complete the requested operation.
BOA to HSM.
BOA coordinates with HSM to validate node group and node status.
BOA to S3.
BOA coordinates with S3 to verify boot artifacts such as the kernel, initrd, and root file system.
BOA to BSS.
BOA updates BSS with boot artifacts and kernel parameters for each node.
BOA to CAPMC.
BOA coordinates with CAPMC to power on the nodes.
CAPMC boots nodes.
CAPMC interfaces directly with the Redfish APIs and powers on the selected nodes.
BSS interacts with the nodes.
BSS generates iPXE boot scripts based on the image content and boot parameters that have been assigned to a node. Nodes download the iPXE boot script from BSS.
S3 interacts with the nodes.
Nodes download the boot artifacts. The nodes boot using the boot artifacts pulled from S3.
BOA to HSM.
BOA waits for the nodes to boot up and be accessible via SSH. This can take up to 30 minutes. BOA coordinates with HSM to ensure that the nodes are booted and that Ansible can SSH to them.
BOA to CFS.
BOA directs CFS to apply post-boot configuration.
CFS applies configuration.
CFS runs Ansible on the nodes and applies post-boot configuration (also called node personalization). CFS then communicates the results back to BOA.
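The wait in the second "BOA to HSM" step above can be pictured as a poll with a deadline. The sketch below is not BOA's implementation; it is a generic illustration in which the readiness check, the polling interval, and the node names are all assumptions. Only the 30 minute upper bound comes from the text.

```python
import time


def wait_until_ready(nodes, is_ready, timeout_s=30 * 60, interval_s=10.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll until every node is ready or the deadline passes.

    Returns the set of nodes that were still not ready when polling stopped
    (an empty set means every node became ready in time).
    """
    deadline = clock() + timeout_s
    pending = set(nodes)
    while pending:
        # Keep only the nodes that still fail the readiness check.
        pending = {node for node in pending if not is_ready(node)}
        if not pending or clock() >= deadline:
            break
        sleep(interval_s)
    return pending


if __name__ == "__main__":
    # Hypothetical check that reports every node as ready immediately.
    leftover = wait_until_ready(["node-a", "node-b"], is_ready=lambda node: True)
    print("nodes not ready:", sorted(leftover))
```

Passing the clock and sleep functions as parameters lets the loop be exercised with a simulated clock instead of waiting in real time.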
Use case: Administrator reconfigures compute nodes that are already booted and configured.
Components: This workflow is based on the interaction of the BOS with other services during the reconfiguration process.
Workflow overview: The following sequence of steps occurs during this workflow.
Administrator creates a session template.
A session template is a collection of data specifying a group of nodes, as well as the boot artifacts and configuration that should be applied to them. A session template can be created from a JSON structure. It returns a session template ID if successful.
See Manage a session template for more information.
Administrator creates a session.
Create a session to perform the operation specified in the operation request parameter on the boot set defined in the session template.
For this use case, the administrator creates a session with operation as configure and specifies the session template ID.
ncn-mw# cray bos session create --template-uuid SESSIONTEMPLATE_NAME --operation configure
Launch BOA.
The creation of a session results in the creation of a Kubernetes BOA job to complete the operation. BOA coordinates with the underlying subsystem to complete the requested operation.
BOA to HSM.
BOA coordinates with HSM to validate node group and node status.
BOA to CFS.
BOA directs CFS to apply post-boot configuration.
CFS applies configuration.
CFS runs Ansible on the nodes and applies post-boot configuration (also called node personalization).
CFS to BOA.
CFS then communicates the results back to BOA.
Use case: Administrator powers off selected compute nodes.
Components: This workflow is based on the interaction of the Boot Orchestration Service (BOS) with other services during the node shutdown process.
Workflow overview: The following sequence of steps occurs during this workflow.
Administrator creates a session template.
A session template is a collection of data specifying a group of nodes, as well as the boot artifacts and configuration that should be applied to them. A session template can be created from a JSON structure. It returns a session template ID if successful.
See Manage a session template for more information.
Administrator creates a session.
Create a session to perform the operation specified in the operation request parameter on the boot set defined in the session template.
For this use case, the administrator creates a session with operation as shutdown and specifies the session template ID.
ncn-mw# cray bos session create --template-uuid SESSIONTEMPLATE_NAME --operation shutdown
Launch BOA.
The creation of a session results in the creation of a Kubernetes BOA job to complete the operation. BOA coordinates with the underlying subsystem to complete the requested operation.
BOA to HSM.
BOA coordinates with HSM to validate node group and node status.
BOA to CAPMC.
BOA directs CAPMC to power off the nodes.
CAPMC to the nodes.
CAPMC interfaces directly with the Redfish APIs and powers off the selected nodes.
CAPMC to BOA.
CAPMC communicates the results back to BOA.
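The three workflows above differ mainly in which services BOA contacts and in what order. As a compact summary, the sketch below records the "BOA to ..." steps of each workflow as data; the dictionary and the helper services_for are illustrative names, and the reboot operation is omitted because this document does not walk through it.

```python
# Services BOA contacts in each workflow, in the order the steps appear above.
BOA_CONTACTS = {
    "boot": ["HSM", "S3", "BSS", "CAPMC", "HSM", "CFS"],
    "configure": ["HSM", "CFS"],
    "shutdown": ["HSM", "CAPMC"],
}


def services_for(operation):
    """Return the distinct services for an operation, in first-contact order."""
    distinct = []
    for service in BOA_CONTACTS[operation]:
        if service not in distinct:
            distinct.append(service)
    return distinct


# Services that every documented workflow relies on.
common = set.intersection(*(set(steps) for steps in BOA_CONTACTS.values()))

for operation, steps in BOA_CONTACTS.items():
    print(f"{operation:<10} BOA -> " + " -> ".join(steps))
print("contacted in every workflow:", sorted(common))
```

Laid out this way, it is easy to see that every workflow starts with HSM validation, while only the boot workflow involves S3 and BSS.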