The Boot Orchestration Service (BOS) creates a session when it is asked to perform an operation on a session template. Sessions provide a way to track the status of many nodes at once as they perform the same operation with the same session template information. When creating a session, both the operation and session template are required parameters.
The v2 version of BOS supports these operations:
See Manage a BOS Session for more information on creating and managing BOS sessions.
In BOS v2, components are managed independently. Each component has an associated record in BOS that contains information about its desired state and its current state. Several operators monitor for components that require action and trigger the associated action.
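The operator pattern described above can be sketched as a comparison of desired state against current state. This is a minimal illustration, not the BOS implementation: the `Component` class, its field names, the state strings, and the node names are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """Hypothetical, simplified stand-in for a BOS v2 component record."""
    name: str
    desired_state: str
    current_state: str
    enabled: bool = True


def needs_action(component: Component) -> bool:
    """Return True when an enabled component has not reached its desired state."""
    return component.enabled and component.current_state != component.desired_state


def select_components(components):
    """Return the names of components that an operator would act on."""
    return [c.name for c in components if needs_action(c)]


# Illustrative records only; the names and states are made up.
components = [
    Component("x1000c0s0b0n0", desired_state="booted", current_state="off"),
    Component("x1000c0s0b0n1", desired_state="booted", current_state="booted"),
    Component("x1000c0s0b1n0", desired_state="booted", current_state="off", enabled=False),
]

pending = select_components(components)
print(pending)
```

Only the first component is selected here: the second already matches its desired state, and the third is disabled, so this sketch leaves both alone.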
Sessions in BOS v2 are constructs intended to help users track an operation across multiple components at once. A session continues to track a component until the component reaches its desired state and is disabled, or until another session takes over managing that component. Additional status information is also available for sessions, such as how many components are at each step in the process. See View the Status of a BOS Session for more information.
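To illustrate the kind of per-step summary a session status can convey, the sketch below counts how many components are in each phase. The phase names, the input mapping, and the `summarize_phases` helper are hypothetical; the actual status fields are described in View the Status of a BOS Session.

```python
from collections import Counter


def summarize_phases(component_phases):
    """Count components per phase and compute each phase's share of the total.

    component_phases maps a component name to its current phase (hypothetical data).
    """
    counts = Counter(component_phases.values())
    total = len(component_phases)
    percents = {phase: 100.0 * n / total for phase, n in counts.items()} if total else {}
    return dict(counts), percents


# Illustrative data only; the node names and phase names are assumptions.
phases = {
    "node1": "powering_on",
    "node2": "configuring",
    "node3": "configuring",
    "node4": "complete",
}

counts, percents = summarize_phases(phases)
print(counts)
print(percents)
```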
In CSM 1.5.0 on large systems, BOS v2 sessions may not work properly because of a failed interaction with CFS. For more information, see CFS V2 Failures On Large Systems.
The v1 version of BOS supports these operations:
See Manage a BOS Session for more information on creating and managing BOS sessions.
The Boot Orchestration Agent (BOA) is a Kubernetes job that manages all the components until the session is complete. If there are transient failures, BOA will exit and Kubernetes will reschedule it so that it can re-execute its session.
BOA moves nodes toward the requested state. If a node fails during any of the intermediate steps, BOA records the failure and then provides a command in the BOA log output that can be used to retry the action.
For example, if 3 nodes on a 6,000-node system fail to power off during a BOS operation, then BOA will continue and attempt to re-provision the remaining 5,997 nodes. After the operation is finished, BOA will provide information about what the administrator needs to do in order to retry the operation on the 3 nodes that failed.
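The bookkeeping in this example can be sketched as a simple partition of the node list. This is an illustration of the arithmetic only and does not reflect how BOA is implemented; the node names and the `split_results` helper are made up for the example.

```python
def split_results(all_nodes, failed_nodes):
    """Separate the nodes that continue from the nodes that must be retried."""
    failed = set(failed_nodes)
    remaining = [node for node in all_nodes if node not in failed]
    return remaining, sorted(failed)


# Hypothetical 6,000-node system in which 3 nodes fail to power off.
all_nodes = [f"node{i:04d}" for i in range(6000)]
failed_to_power_off = ["node0007", "node1234", "node5999"]

remaining, retry = split_results(all_nodes, failed_to_power_off)
print(len(remaining), len(retry))  # prints: 5997 3
```

The `retry` list corresponds to the nodes an administrator would target when rerunning the operation with the command reported in the BOA log.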
The following limitations currently exist with BOS sessions: