The Boot Orchestration Service (BOS) supports a status endpoint that reports the status of individual BOS sessions. In BOS v1, the status can be retrieved for each boot set within the session, as well as for the individual items within a boot set.
BOS sessions contain one or more boot sets. Each boot set contains one or more phases, depending on the operation for that session.
For example, a `reboot` operation would have a `shutdown`, `boot`, and possibly configuration phase, but a `shutdown` operation would only have a `shutdown` phase.
Each phase contains the following categories: `not_started`, `in_progress`, `succeeded`, `failed`, and `excluded`.
Each session, boot set, and phase contains similar metadata. The following is a table of useful attributes to look for in the metadata:
Attribute | Meaning |
---|---|
`start_time` | The time when a session, boot set, or phase started work. |
`in_progress` | If true, the session, boot set, or phase has started and still has work going on. |
`complete` | If true, the session, boot set, or phase has finished. |
`error_count` | The number of errors encountered in the boot sets or phases. |
`stop_time` | The time when a session, boot set, or phase ended work. |
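As an illustration of how `start_time` and `stop_time` can be used together, the following Python sketch computes how long an item ran. The timestamp format is an assumption inferred from the example output in this document, and the helper name `elapsed` is hypothetical (it is not part of BOS or the Cray CLI).

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumed format, inferred from values such as "2020-07-22 13:39:08.276530".
TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def elapsed(metadata: dict) -> Optional[timedelta]:
    """Return stop_time minus start_time, or None if no stop_time is present."""
    if "stop_time" not in metadata:
        return None
    start = datetime.strptime(metadata["start_time"], TIME_FORMAT)
    stop = datetime.strptime(metadata["stop_time"], TIME_FORMAT)
    return stop - start


# Metadata copied from the shutdown phase in the boot set example in this document.
shutdown_metadata = {
    "stop_time": "2020-07-22 13:53:19.842705",
    "in_progress": False,
    "start_time": "2020-07-22 13:39:08.276530",
    "complete": True,
    "error_count": 4,
}
print(elapsed(shutdown_metadata))
```

Items that are still running have no `stop_time` in the examples shown here, which is why the sketch returns `None` in that case rather than raising an error.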
The following table summarizes how to interpret the various combinations of values for the `in_progress` and `complete` flags:
`in_progress` | `complete` | Meaning |
---|---|---|
false | false | Item has not started. |
true | false | Item is in progress. |
false | true | Item has completed. |
true | true | Invalid state (should not occur). |
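The flag combinations above can be written as a small lookup. This is a minimal Python sketch for illustration only; the function name `describe_flags` and the returned strings are hypothetical and not part of BOS.

```python
def describe_flags(in_progress: bool, complete: bool) -> str:
    """Map the in_progress/complete pair to its meaning in the table above."""
    if in_progress and complete:
        # The table lists this combination as an invalid state.
        return "invalid"
    if in_progress:
        return "in progress"
    if complete:
        return "completed"
    return "not started"


# Flags copied from the shutdown phase metadata in the boot set example.
print(describe_flags(in_progress=False, complete=True))
```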
The `in_progress`, `complete`, and `error_count` fields are cumulative, meaning that they summarize the state of the sub-items.
Item | `in_progress` meaning | `complete` meaning |
---|---|---|
Phase | If true, there is at least one node in the `in_progress` category. | If true, there are no nodes in the `in_progress` or `not_started` categories. |
Boot set | If true, there is at least one phase that is `in_progress`. | If true, all phases in the boot set are `complete`. |
Session | If true, at least one boot set is `in_progress`. | If true, all boot sets are `complete`. |
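The roll-up rules in the table above can be sketched in Python as follows. This only mirrors the documented rules for illustration; it is not how BOS computes the flags internally, and the names `phase_flags` and `rollup` are hypothetical.

```python
from typing import Dict, List


def phase_flags(categories: List[dict]) -> Dict[str, bool]:
    """Derive a phase's flags from its categories, following the table above."""
    nodes = {c["name"]: c["node_list"] for c in categories}
    in_progress = bool(nodes.get("in_progress"))
    # Complete means nothing is in progress and nothing is waiting to start.
    complete = not nodes.get("in_progress") and not nodes.get("not_started")
    return {"in_progress": in_progress, "complete": complete}


def rollup(children: List[Dict[str, bool]]) -> Dict[str, bool]:
    """Combine child flags: phases into a boot set, or boot sets into a session."""
    return {
        "in_progress": any(c["in_progress"] for c in children),
        "complete": all(c["complete"] for c in children),
    }


# Two phases shaped like the boot set example in this document (node lists shortened).
shutdown = phase_flags([
    {"name": "not_started", "node_list": []},
    {"name": "failed", "node_list": ["x3000c0s19b4n0"]},
    {"name": "in_progress", "node_list": []},
])
boot = phase_flags([
    {"name": "not_started", "node_list": ["x3000c0s19b4n0"]},
    {"name": "failed", "node_list": []},
    {"name": "in_progress", "node_list": []},
])
print(shutdown, boot, rollup([shutdown, boot]))
```

Applied to those two phases, the sketch reports the shutdown phase as complete and the boot set as neither in progress nor complete, which agrees with the flags in the example output.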
The BOS session ID is required to view the status of a session.
Note: If the command below fails, there may be too many BOS sessions. For more information, see Hang Listing BOS Sessions.
To list the available sessions, use the following command:
ncn-mw# cray bos session list --format json
Example output:
[
"99a192c2-050e-41bc-a576-548610851742",
"4374f3e6-e8ed-4e66-bf63-3ebe0e618db2",
"fb14932a-a9b7-41b2-ad21-b4bc632cf1ef",
"9321ab7a-bf7f-42fd-8103-94a296552856",
"50aaaa85-6807-45c7-b6de-f984a930e2eb",
"972cfd09-3403-4282-ab93-b41992f7c0d8",
"2c86c1b9-5281-4610-b044-479f1536727a",
"7719385a-e462-4bb6-8fd8-55caa0836528",
"0aac0252-4637-4198-919f-6bafda7fafef",
"13207c87-0b9f-410c-88c1-6e26ff63cb34",
"bd18e7e3-978f-4699-b8f2-8a4ce2d46f75",
"b741e4de-2064-4de4-9f23-20b6c1d0dc1a",
"f4eebe51-a217-46d0-8733-b9499a092042"
]
It is recommended to describe the session using one of the session IDs above to verify that the desired session was selected:
ncn-mw# cray bos session describe SESSION_ID --format toml
Example output:
status_link = "/v1/session/f4eebe51-a217-46d0-8733-b9499a092042/status"
complete = false
start_time = "2020-07-22 13:39:07.706774"
templateName = "cle-1.3.0"
error_count = 4
boa_job_name = "boa-f4eebe51-a217-46d0-8733-b9499a092042"
in_progress = false
operation = "reboot"
The status for the session will show the session ID, the boot sets in the session, the metadata, and some links. In the following example, there is only one boot set (named `computes`), and the session ID being used is `f4eebe51-a217-46d0-8733-b9499a092042`.
To display the status for the session:
ncn-mw# cray bos session status list SESSION_ID --format json
Example output:
{
"boot_sets": [
"computes"
],
"id": "f4eebe51-a217-46d0-8733-b9499a092042",
"links": [
{
"href": "/v1/session/f4eebe51-a217-46d0-8733-b9499a092042/status",
"rel": "self"
},
{
"href": "/v1/session/f4eebe51-a217-46d0-8733-b9499a092042/status/computes" ,
"rel": "Boot Set"
}
],
"metadata": {
"in_progress": false,
"start_time": "2020-07-22 13:39:07.706774",
"complete": false,
"error_count": 4
}
}
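The `links` section can be used to find the status path of each boot set. The following Python sketch filters the links by their `rel` value. The input is trimmed from the example output above, and the helper name `boot_set_hrefs` is hypothetical.

```python
import json
from typing import List

# Trimmed copy of the session status shown in the example output above.
session_status = json.loads("""
{
  "boot_sets": ["computes"],
  "id": "f4eebe51-a217-46d0-8733-b9499a092042",
  "links": [
    {"href": "/v1/session/f4eebe51-a217-46d0-8733-b9499a092042/status",
     "rel": "self"},
    {"href": "/v1/session/f4eebe51-a217-46d0-8733-b9499a092042/status/computes",
     "rel": "Boot Set"}
  ]
}
""")


def boot_set_hrefs(status: dict) -> List[str]:
    """Return the href of every link whose rel is 'Boot Set'."""
    return [link["href"] for link in status["links"] if link["rel"] == "Boot Set"]


print(boot_set_hrefs(session_status))
```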
Run the following command to view the status for a specific boot set in a session. For more information about retrieving the session ID and boot set name, see the View the status of a v1 session section above. The different sections of the status are described below.
- The `id` parameter identifies which session this status belongs to.
- The `name` parameter is the name of the boot set.
- The `links` section displays links that enable administrators to drill down into each phase of the boot set.
- There is a `metadata` section for the boot set as a whole.
- Within each phase, the `name` parameter is the name of the phase.
- There is a `metadata` section for each phase.
- Each phase contains the following categories: `not_started`, `in_progress`, `succeeded`, `failed`, and `excluded`. The nodes are listed in the category they are currently occupying.

ncn-mw# cray bos session status describe BOOT_SET_NAME SESSION_ID --format json
Example output:
{
"phases": [
{
"name": "shutdown",
"categories": [
{
"name": "not_started",
"node_list": []
},
{
"name": "succeeded",
"node_list": []
},
{
"name": "failed",
"node_list": [
"x3000c0s19b4n0",
"x3000c0s19b1n0",
"x3000c0s19b3n0",
"x3000c0s19b2n0"
]
},
{
"name": "excluded",
"node_list": []
},
{
"name": "in_progress",
"node_list": []
}
],
"metadata": {
"stop_time": "2020-07-22 13:53:19.842705",
"in_progress": false,
"start_time": "2020-07-22 13:39:08.276530",
"complete": true,
"error_count": 4
}
},
{
"name": "boot",
"categories": [
{
"name": "not_started",
"node_list": [
"x3000c0s19b4n0",
"x3000c0s19b3n0",
"x3000c0s19b1n0",
"x3000c0s19b2n0"
]
},
{
"name": "succeeded",
"node_list": []
},
{
"name": "failed",
"node_list": []
},
{
"name": "excluded",
"node_list": []
},
{
"name": "in_progress",
"node_list": []
}
],
"metadata": {
"in_progress": false,
"start_time": "2020-07-22 13:39:08.276542",
"complete": false,
"error_count": 0
}
},
{
"name": "configure",
"categories": [
{
"name": "not_started",
"node_list": [
"x3000c0s19b4n0",
"x3000c0s19b3n0",
"x3000c0s19b1n0",
"x3000c0s19b2n0"
]
},
{
"name": "succeeded",
"node_list": []
},
{
"name": "failed",
"node_list": []
},
{
"name": "excluded",
"node_list": []
},
{
"name": "in_progress",
"node_list": []
}
],
"metadata": {
"in_progress": false,
"start_time": "2020-07-22 13:39:08.276552",
"complete": false,
"error_count": 0
}
}
],
"session": "f4eebe51-a217-46d0-8733-b9499a092042",
"name": "computes",
"links": [
{
"href": "/v1/session/f4eebe51-a217-46d0-8733-b9499a092042/status/computes",
"rel": "self"
},
{
"href": "/v1/session/f4eebe51-a217-46d0-8733-b9499a092042/status/computes/shutdown",
"rel": "Phase"
},
{
"href": "/v1/session/f4eebe51-a217-46d0-8733-b9499a092042/status/computes/boot",
"rel": "Phase"
},
{
"href": "/v1/session/f4eebe51-a217-46d0-8733-b9499a092042/status/computes/configure",
"rel": "Phase"
}
],
"metadata": {
"in_progress": false,
"start_time": "2020-07-22 13:39:08.276519",
"complete": false,
"error_count": 4
}
}
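When a boot set has many nodes, it can help to pull out only the nodes in one category, such as `failed`. The following Python sketch does this for a parsed boot set status. The sample data is shortened from the example output above, and the helper name `nodes_by_phase` is hypothetical.

```python
from typing import Dict, List


def nodes_by_phase(boot_set_status: dict, category: str) -> Dict[str, List[str]]:
    """Map each phase name to the nodes currently in the given category."""
    result = {}
    for phase in boot_set_status["phases"]:
        for cat in phase["categories"]:
            if cat["name"] == category:
                result[phase["name"]] = cat["node_list"]
    return result


# Shortened version of the boot set status shown above (two phases, two nodes).
boot_set_status = {
    "name": "computes",
    "phases": [
        {"name": "shutdown", "categories": [
            {"name": "not_started", "node_list": []},
            {"name": "failed", "node_list": ["x3000c0s19b4n0", "x3000c0s19b1n0"]},
        ]},
        {"name": "boot", "categories": [
            {"name": "not_started", "node_list": ["x3000c0s19b4n0", "x3000c0s19b1n0"]},
            {"name": "failed", "node_list": []},
        ]},
    ],
}

# Keep only the phases that actually have failed nodes.
failed = {p: n for p, n in nodes_by_phase(boot_set_status, "failed").items() if n}
print(failed)
```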
Direct calls to the API are needed to retrieve the status for an individual phase. Support for the Cray CLI is not currently available.
The following command is used to view the status of a phase:
ncn-mw# curl -H "Authorization: Bearer BEARER_TOKEN" -X GET https://api-gw-service-nmn.local/apis/bos/v1/session/SESSION_ID/status/BOOT_SET_NAME/PHASE
In the following example, the session ID is `f89eb554-c733-4197-b2f2-4e1e5ba0c0ec`, the boot set name is `computes`, and the individual phase is `shutdown`.
ncn-mw# curl -H "Authorization: Bearer BEARER_TOKEN" -X GET https://api-gw-service-nmn.local/apis/bos/v1/session/f89eb554-c733-4197-b2f2-4e1e5ba0c0ec/status/computes/shutdown
Example output:
{
"categories": [
{
"name": "not_started",
"node_list": []
},
{
"name": "succeeded",
"node_list": []
},
{
"name": "failed",
"node_list": []
},
{
"name": "excluded",
"node_list": []
},
{
"name": "in_progress",
"node_list": [
"x5000c1s2b0n1",
"x5000c1s0b0n0",
"x3000c0s19b4n0",
"x5000c1s0b1n0",
"x5000c1s0b1n1",
"x5000c1s1b1n1",
"x5000c1s2b0n0",
"x3000c0s19b3n0",
"x5000c1s0b0n1",
"x5000c1s2b1n1",
"x3000c0s19b1n0",
"x5000c1s1b1n0",
"x5000c1s2b1n0",
"x3000c0s19b2n0",
"x5000c1s1b0n1",
"x5000c1s1b0n0"
]
}
],
"metadata": {
"complete": false,
"error_count": 0,
"in_progress": true,
"start_time": "2020-06-30 21:42:39.355423"
},
"name": "shutdown"
}
Direct calls to the API are needed to retrieve the status for an individual category. Support for the Cray CLI is not currently available.
The following command is used to view the status of a category:
ncn-mw# curl -H "Authorization: Bearer BEARER_TOKEN" -X GET https://api-gw-service-nmn.local/apis/bos/v1/session/SESSION_ID/status/BOOT_SET_NAME/PHASE/CATEGORY
In the following example, the session ID is `f89eb554-c733-4197-b2f2-4e1e5ba0c0ec`, the boot set name is `computes`, the phase is `shutdown`, and the category is `in_progress`.
ncn-mw# curl -H "Authorization: Bearer BEARER_TOKEN" -X GET https://api-gw-service-nmn.local/apis/bos/v1/session/f89eb554-c733-4197-b2f2-4e1e5ba0c0ec/status/computes/shutdown/in_progress
Example output:
{
"name": "in_progress",
"node_list": [
"x5000c1s2b0n1",
"x5000c1s0b0n0",
"x3000c0s19b4n0",
"x5000c1s0b1n0",
"x5000c1s0b1n1",
"x5000c1s1b1n1",
"x5000c1s2b0n0",
"x3000c0s19b3n0",
"x5000c1s0b0n1",
"x5000c1s2b1n1",
"x3000c0s19b1n0",
"x5000c1s1b1n0",
"x5000c1s2b1n0",
"x3000c0s19b2n0",
"x5000c1s1b0n1",
"x5000c1s1b0n0"
]
}
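The status URLs used in this document follow one nested pattern: session, then boot set, then phase, then category. The following Python sketch builds a URL for any of those levels. The base URL is taken from the `curl` examples above, and the helper name `status_url` is hypothetical. The sketch only builds the string; it does not call the API or handle authentication.

```python
from typing import Optional
from urllib.parse import quote

# Base URL as it appears in the curl examples in this document.
BOS_V1_BASE = "https://api-gw-service-nmn.local/apis/bos/v1"


def status_url(session_id: str,
               boot_set: Optional[str] = None,
               phase: Optional[str] = None,
               category: Optional[str] = None) -> str:
    """Build the status URL for a session, boot set, phase, or category."""
    url = f"{BOS_V1_BASE}/session/{quote(session_id, safe='')}/status"
    skipped = False
    for part in (boot_set, phase, category):
        if part is None:
            skipped = True
            continue
        if skipped:
            # For example, a category cannot be given without its phase.
            raise ValueError("a deeper level was given without its parent level")
        url += "/" + quote(part, safe="")
    return url


print(status_url("f89eb554-c733-4197-b2f2-4e1e5ba0c0ec", "computes", "shutdown", "in_progress"))
```

The resulting string can be passed to `curl` in place of the hand-written URL, together with the same `Authorization` header shown in the commands above.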