BOS operators are a feature of BOS v2 only.
BOS v2 has many different operators. Each operator is responsible for doing a single basic task – for example, powering on nodes, discovering new nodes on the system and creating BOS components for them, or initializing a pending BOS v2 session.
The work for a BOS v2 session is done by several different operators, all acting independently of each other. These operators are always running, even when no v2 session is actively underway. This is a big difference from BOS v1, where each session created a new dedicated Kubernetes job that handled everything associated with that v1 session.
Each operator follows the same basic execution loop:
Some BOS Options apply only to specific operators – these are noted in the relevant operator descriptions in the Operator list. The following BOS options apply to operators generally:
logging_level
polling_frequency (does not apply to the discovery operator, which instead uses the discovery_frequency option for its sleep interval)
max_component_batch_size
bss_read_timeout, cfs_read_timeout, hsm_read_timeout, and capmc_read_timeout
See Options for more information.
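As a rough illustration of how polling_frequency and max_component_batch_size could shape an operator's loop, here is a minimal Python sketch. It is not the BOS implementation: the function names, the callback parameters, and the fetch/act/sleep structure are assumptions made for this example, and it only handles a positive batch size.

```python
import time
from typing import Callable, Iterator, List


def batched(items: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        # Assumption for this sketch only; BOS may treat other values differently.
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def run_operator(
    fetch_components: Callable[[], List[str]],
    act: Callable[[List[str]], None],
    polling_frequency: float,
    max_component_batch_size: int,
    iterations: int,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Hypothetical loop: fetch components, act on them in batches, then sleep."""
    for _ in range(iterations):
        for batch in batched(fetch_components(), max_component_batch_size):
            act(batch)
        # The discovery operator would sleep for discovery_frequency instead.
        sleep(polling_frequency)
```

Passing the sleep function as a parameter keeps the sketch testable without waiting for real time to elapse.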
(ncn-mw#) The BOS operators run in Kubernetes pods in the services namespace.
kubectl get pods -n services | grep '^cray-bos-operator-'
Example output:
cray-bos-operator-actual-state-cleanup-596dc4766c-xsdg2 2/2 Running 0 50d
cray-bos-operator-configuration-865f95f7d7-2j2tf 2/2 Running 0 50d
cray-bos-operator-discovery-698b44f9f9-fhs9c 2/2 Running 0 50d
cray-bos-operator-power-off-forceful-666f76c98f-mx5vb 2/2 Running 0 50d
cray-bos-operator-power-off-graceful-6489689c99-wch2p 2/2 Running 0 50d
cray-bos-operator-power-on-7d778c67cc-8vbrj 2/2 Running 0 50d
cray-bos-operator-session-cleanup-68c4cdbcc-qfvvr 2/2 Running 0 50d
cray-bos-operator-session-completion-756b4ddfb5-584bk 2/2 Running 0 50d
cray-bos-operator-session-setup-654544c589-9t9rr 2/2 Running 0 50d
cray-bos-operator-status-7665867877-gcj59 2/2 Running 0 50d
actual-state-cleanup
configuration
discovery
power-off-forceful
power-off-graceful
power-on
session-cleanup
session-completion
session-setup
status

actual-state-cleanup
This operator clears the actual_state field for components when the field has not been updated within a specified time.
This ensures that BOS keeps accurate information on the state of all components.
The time limit is controlled by the component_actual_state_ttl option.
WARNING: Unlike the cleanup_completed_session_ttl option, a zero value for the component_actual_state_ttl option will not disable the cleanup behavior. For details, see component_actual_state_ttl.
configuration
This operator is responsible for setting the desired configuration in the Configuration Framework Service (CFS) for components that are in the configuring phase of the boot process.
Typically, this operator has nothing to do, because the power-on operator sets the desired configuration prior
to booting components. The exception is when a node is already booted and configured, and a BOS session is created to boot
(not reboot) the node using the same boot artifacts, but a different CFS configuration. In this case, the power-on operator
is never called, and the configuration operator sets the desired configuration instead.
discovery
This operator checks the Hardware State Manager (HSM) to discover new nodes. If any are found, it creates BOS component records for them.
For its execution loop, this operator sets its sleep interval to the
discovery_frequency option. See Options for more information.
power-off-forceful
This operator calls Cray Advanced Platform Monitoring and Control (CAPMC) to forcefully power off components when a previous power off action has failed to power them off.
power-off-graceful
This operator calls CAPMC to gracefully power off components that have a power-off-pending status.
power-on
For each enabled BOS component that has a power-on-pending status, this operator does the following:
Writes the kernel, kernel parameters, and initrd to BSS and records the bss-referral-token that is sent back by BSS.
For more information on what is written to BSS, see Upload Node Boot Information to Boot Script Service (BSS).
Patches the node in CFS to disable it, clear its state, and set its desired configuration.
Calls CAPMC to power on the node.
Apart from the API server, this operator is the only part of BOS that directly accesses a BOS database. Specifically, after the BSS step in the above procedure, the operator writes an entry in the boot artifacts database. The key for the entry is the BSS token. The value of the entry is a dictionary containing the kernel, kernel parameters, and
initrd. This is the only case where this operator directly interacts with any BOS database; all other interactions go through the BOS API, as usual.
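The ordering of the power-on steps, including the boot artifacts database write, can be sketched in Python as follows. This is an illustration under stated assumptions, not BOS code: RecordingClient, its method names, the token format, and the dictionary key names are all hypothetical stand-ins for the real BSS, CFS, and CAPMC interactions.

```python
from typing import Dict, List


class RecordingClient:
    """Hypothetical stand-in for the BSS, CFS, and CAPMC clients; it only logs calls."""

    def __init__(self, log: List[str]) -> None:
        self.log = log

    def write_boot_info(self, node: str, artifacts: Dict[str, str]) -> str:
        self.log.append(f"bss:{node}")
        return f"token-for-{node}"  # placeholder for the bss-referral-token

    def patch_component(self, node: str, desired_config: str) -> None:
        # Stands in for: disable the node, clear its state, set its desired configuration.
        self.log.append(f"cfs:{node}:{desired_config}")

    def power_on(self, node: str) -> None:
        self.log.append(f"capmc:{node}")


def power_on_component(
    node: str,
    artifacts: Dict[str, str],
    desired_config: str,
    bss: RecordingClient,
    cfs: RecordingClient,
    capmc: RecordingClient,
    artifacts_db: Dict[str, Dict[str, str]],
) -> str:
    """Run the power-on steps for one component, in the order described above."""
    token = bss.write_boot_info(node, artifacts)
    # After the BSS step: store the artifacts in the database, keyed by the BSS token.
    artifacts_db[token] = {
        "kernel": artifacts["kernel"],
        "kernel_parameters": artifacts["kernel_parameters"],
        "initrd": artifacts["initrd"],
    }
    cfs.patch_component(node, desired_config)
    capmc.power_on(node)
    return token
```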
session-cleanup
This operator deletes completed v2 sessions from BOS that are older than a specified age.
The age is controlled by the cleanup_completed_session_ttl option.
If that option has a zero value, then this cleanup behavior is disabled.
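The two TTL options treat a zero value differently, and the contrast can be sketched in Python. This is a simplified illustration: the function names are hypothetical, the TTLs are modeled as timedelta values rather than whatever format the BOS options use, and the treatment of a zero component_actual_state_ttl simply follows from the comparison in this sketch (see the option's own documentation for the actual behavior).

```python
from datetime import datetime, timedelta


def actual_state_is_stale(last_updated: datetime, now: datetime, ttl: timedelta) -> bool:
    """True if the actual_state field has not been updated within the TTL.

    A zero TTL does not disable the check: in this sketch, any record
    updated before `now` is then treated as stale.
    """
    return now - last_updated > ttl


def session_should_be_deleted(completed_at: datetime, now: datetime, ttl: timedelta) -> bool:
    """True if a completed session is older than the TTL.

    A zero TTL disables session cleanup entirely.
    """
    if ttl == timedelta(0):
        return False
    return now - completed_at > ttl
```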
session-completion
For each running BOS v2 session, this operator checks to see if any BOS components are associated with that session and still have work (or staged work) to be done. If not, then it marks the session as complete and saves a final status for the session.
More specifically, for a given running session, the operator looks for all components which meet either of the following criteria:
session field is set to the name of the session
staged_state.session field is set to the name of the session
If the clear_stage option is set to false, then BOS will not clear the staged
state of nodes after applying the staged state. This in turn
means that the associated staged session will never be marked complete by the session-completion operator.
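The completion check described above can be sketched in Python. Component records are modeled here as plain dictionaries with session and staged_state.session fields; that representation and the function names are assumptions for illustration only.

```python
from typing import Any, Dict, List


def components_for_session(
    components: List[Dict[str, Any]], session_name: str
) -> List[Dict[str, Any]]:
    """Return components whose session or staged_state.session names the session."""
    return [
        component
        for component in components
        if component.get("session") == session_name
        or component.get("staged_state", {}).get("session") == session_name
    ]


def session_is_complete(components: List[Dict[str, Any]], session_name: str) -> bool:
    """A running session is complete once no component still references it."""
    return not components_for_session(components, session_name)
```

In this model, a component whose staged state is never cleared keeps matching the second criterion, which is why the associated staged session would never be marked complete.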
session-setup
This operator monitors for pending v2 sessions and moves them into the running state. For each pending session found, the following procedure is performed by the operator:
Contact HSM to get the following:
For each boot set in the session template, the operator does the following steps:
The target node list for the boot set starts empty.
If the boot set node_list field is set, add those components to the target list.
For any HSM groups specified in the node_groups field of the boot set,
add the associated components to the target list.
For any HSM roles or subroles specified in the node_roles_groups field,
add the associated components to the target list.
If a session limit was specified, apply it to the target list, removing any components which do not match the limit.
If the session include_disabled field is false, then remove any components that are disabled in HSM.
For each target component, determine what target state it should have in BOS. This is based on the session operation, the CFS settings in the session template, and the boot artifacts in the boot set.
If this is a non-staged session (staged field is false), then this will be used to determine the desired state.
If this is a staged session, then this will be used to determine the staged state.
For each component that was identified in the previous step, the BOS component record will be patched with the following changes:
If this is not a staged session, then the patch will also include the following:
session_setup.
Patch the BOS session record to make the following changes:
Set the status.status field to running.
Set the components field to a comma-separated list of the target components.
Related: BOS v2 sessions and HSM locks.
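The target list steps of the session-setup procedure can be sketched in Python as follows. This is a simplified illustration, not the BOS implementation: the function name and the hsm_groups, hsm_roles, and hsm_disabled parameters are hypothetical stand-ins for data obtained from HSM, and the session limit is simplified to an explicit collection of component names.

```python
from typing import Dict, Iterable, List, Optional


def build_target_list(
    boot_set: Dict[str, List[str]],
    hsm_groups: Dict[str, List[str]],
    hsm_roles: Dict[str, List[str]],
    hsm_disabled: Iterable[str],
    limit: Optional[Iterable[str]] = None,
    include_disabled: bool = False,
) -> List[str]:
    """Build the sorted target component list for one boot set."""
    targets = set()  # the target list starts empty
    targets.update(boot_set.get("node_list", []))
    for group in boot_set.get("node_groups", []):
        targets.update(hsm_groups.get(group, []))
    for role in boot_set.get("node_roles_groups", []):
        targets.update(hsm_roles.get(role, []))
    if limit is not None:
        # Remove any components that do not match the session limit.
        targets &= set(limit)
    if not include_disabled:
        # Remove components that are disabled in HSM.
        targets -= set(hsm_disabled)
    return sorted(targets)
```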
status
This operator is the workhorse that updates the status of components in BOS. For each component that is enabled in BOS, the status operator collects the following information:
The above information is used to determine whether or not the component should be disabled in BOS, and what the new component status should be. This determination is also impacted by the following options:
default_retry_policy
max_boot_wait_time
max_power_off_wait_time
max_power_on_wait_time
disable_components_on_completion
The source for the BOS operators is located in the
Cray-HPE/bos open source GitHub repository.