The Boot Orchestration Service (BOS) currently supports API version v2. The following summarizes how BOS v2 differs from v1 and describes the upgrade path from v1 to v2.
BOS v1 is removed in CSM 1.6. During the upgrade to CSM 1.6, all BOS v1 session data is deleted. Other BOS data may be modified or, in rare cases, deleted. See BOS data notice for more details.
BOS v2 makes significant improvements to boot times, retries, and error handling, by allowing nodes to proceed through the boot process at their own pace. Nodes that require extra time or retries to complete will no longer hold up the rest of the nodes, meaning that more nodes will reach a ready state faster.
The v2 endpoints use plural nouns, sessions and sessiontemplates, rather than session and sessiontemplate. This is more consistent with other CSM APIs.
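The renaming can be illustrated with a small path-rewriting helper. This is only a sketch: the `/v1/...` and `/v2/...` path shapes and the `to_v2_path` function are assumptions made for illustration, not part of BOS or its API specification.

```python
# Resource names that became plural in BOS v2 (taken from the text above).
V1_TO_V2_RESOURCE = {
    "session": "sessions",
    "sessiontemplate": "sessiontemplates",
}


def to_v2_path(path: str) -> str:
    """Rewrite a v1-style path to its v2 form.

    Only the version segment and the resource segment directly after it
    are changed, so a resource that happens to be named "session" is
    left alone. The path layout is a hypothetical example.
    """
    parts = path.split("/")
    for i, part in enumerate(parts):
        if part == "v1":
            parts[i] = "v2"
            if i + 1 < len(parts):
                resource = parts[i + 1]
                parts[i + 1] = V1_TO_V2_RESOURCE.get(resource, resource)
            break
    return "/".join(parts)
```

A path that already uses v2 is returned unchanged, so the helper can be applied to a mixed list of paths while updating a script.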
BOS v2 also introduces two new endpoints:
/components endpoint for tracking state at the component level. See Components for details.
/options endpoint to allow changing global options dynamically. See Options for details.
There are some different API responses in BOS v2 as well:
For more information on the BOS API, see BOS API.
To upgrade to BOS v2, users only need to start specifying v2 in the CLI or API. Scripts may also need to be updated to use the plural sessions and sessiontemplates, and to account for the other differences noted in the previous section.
When BOS is upgraded to a version that supports the v2 endpoints, it will automatically migrate all session templates to a v2 form.
BOS v1 versions of the session templates will also remain with _v1_deprecated appended to their names, although the converted v2 session templates can still be used with BOS v1.
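When auditing templates after the migration, it can help to separate the preserved v1 copies from the converted templates by name. The sketch below assumes only the `_v1_deprecated` suffix described above. The helper names and the template names in the usage example are hypothetical.

```python
# Suffix that the migration appends to preserved v1 copies (from the text above).
V1_SUFFIX = "_v1_deprecated"


def split_templates(names):
    """Split template names into (converted v2 templates, preserved v1 copies)."""
    v2_templates, v1_copies = [], []
    for name in names:
        if name.endswith(V1_SUFFIX):
            v1_copies.append(name)
        else:
            v2_templates.append(name)
    return v2_templates, v1_copies


def original_name(name):
    """Return the template name with the v1 suffix removed, if present."""
    if name.endswith(V1_SUFFIX):
        return name[: -len(V1_SUFFIX)]
    return name
```

For example, `split_templates(["compute", "compute_v1_deprecated"])` returns `(["compute"], ["compute_v1_deprecated"])`, and `original_name("compute_v1_deprecated")` returns `"compute"`.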
When a session was created in BOS v1, BOS started a Kubernetes job called the Boot Orchestration Agent (BOA), which managed all of the nodes. BOA moved through one phase at a time, ensuring that all nodes had completed the phase before moving on to the next. This meant that all nodes proceeded in lock step.
BOS v2 does away with BOA, replacing it with long-running operators that are each responsible for moving nodes through a particular phase transition. These operators monitor nodes individually, not as a group, allowing each node to proceed at its own pace, improving retry handling and speeding up the overall booting process.
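The effect of removing the lock step can be seen in a toy model. This is a simplified illustration, not the BOS implementation: each node is assumed to have a fixed duration per phase, retries are folded into those durations, and the node names and numbers are invented.

```python
def lockstep_ready_times(durations):
    """v1-style model: every phase lasts as long as its slowest node.

    durations maps a node name to a list of per-phase durations; all
    nodes are assumed to have the same number of phases.
    """
    total = sum(max(phase) for phase in zip(*durations.values()))
    # All nodes finish together, after the sum of the slowest phases.
    return {node: total for node in durations}


def independent_ready_times(durations):
    """v2-style model: each node advances as soon as it finishes a phase."""
    return {node: sum(phases) for node, phases in durations.items()}


# Hypothetical per-phase durations (arbitrary time units) for three nodes.
example = {
    "n1": [1, 1, 1],
    "n2": [5, 1, 1],  # slow in the first phase
    "n3": [1, 4, 1],  # slow in the second phase
}
```

With these numbers, the lock-step model makes every node ready at time 10 (5 + 4 + 1), while the independent model makes them ready at times 3, 7, and 6. The slowest node still finishes sooner because it no longer waits for another node's slow phase.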
If no version is specified, the BOS CLI defaults to the v2 endpoints. That is, cray bos <command> defaults to cray bos v2 <command>.
To avoid compatibility issues when the CLI’s default version changes, scripts using the CLI should always explicitly specify a version.
The behavior of defaulting to a version when the version parameter is omitted is a convenience intended for interactive use and is not intended for scripts.
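One way to follow this advice in a script is to build every BOS command through a single helper that always inserts the version. The sketch below only constructs the argument list and does not run anything. The `bos_command` helper is hypothetical, and the `sessions list` subcommand is used purely as an illustration.

```python
import shlex


def bos_command(*args, version="v2"):
    """Build a cray CLI argument list with the BOS version stated explicitly."""
    if not version:
        raise ValueError("a BOS API version must be given explicitly")
    return ["cray", "bos", version, *args]


# The version always appears in the command, so a future change to the
# CLI default does not alter what the script runs.
command = bos_command("sessions", "list")
printable = shlex.join(command)  # "cray bos v2 sessions list"
```

The resulting list can be passed to `subprocess.run` on a system where the cray CLI is installed and configured.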