Hardware State Manager (HSM) manages important information for hardware components in the system. Administrators can use the data returned by HSM to learn about the state of the system. To do so, it is critical to understand the State and Flag fields and to know which steps to take next when viewing the output of HSM commands. It is also helpful to understand which services can cause State or Flag changes in HSM.
The following describes what causes State and Flag changes for all HSM components:

- Discovery sets the initial State/Flag to Off/OK or On/OK. BMCs go to Ready/OK instead of On/OK.
- A component in the Populated state after discovery has an unknown power state. If a power state is expected for nodes, BMCs, or another component, this is likely due to a firmware issue.
- The Flag is set to Warning or Alert if the component’s Status.Health reads as Warning or Critical via Redfish during discovery.
- The State of the components behind a BMC is set to Empty if that BMC is removed from the network.
- Redfish events are collected by hmcollector for HSM’s consumption. HSM will update component state based on the information in the Redfish events.

The following describes what causes State and Flag changes for nodes only:

- A node goes to Ready/OK when it starts heartbeats.
- A node goes to Ready/Warning after a few missed heartbeats.
- A node goes to Standby after many missed heartbeats and the node is presumed dead.

State descriptions:

- Empty: The location is not populated with a component.
- Populated: Present (not empty), but no further tracking can be or is being done.
- Off: Present but powered off.
- On: Powered on. If no heartbeat mechanism is available, its software state may be unknown.
- Standby: No longer Ready and presumed dead. This typically means the heartbeat has been lost (with an Alert flag).
- Ready: Both On and Ready to provide its expected services. For example, it can be used for jobs.

Flag descriptions:

- OK: Component is OK.
- Warning: There is a non-critical error. Generally coupled with a Ready state.
- Alert: There is a critical error. Generally coupled with a Standby state. Otherwise, reported via Redfish.

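
The heartbeat-driven behavior for nodes described above can be sketched in a few lines of Python. This is only an illustration, not HSM or HBTD code: the function name `node_status_from_heartbeats` and both thresholds are hypothetical, because this page only says "a few" and "many" missed heartbeats without giving numbers.

```python
from enum import Enum
from typing import Tuple


class State(Enum):
    EMPTY = "Empty"
    POPULATED = "Populated"
    OFF = "Off"
    ON = "On"
    STANDBY = "Standby"
    READY = "Ready"


class Flag(Enum):
    OK = "OK"
    WARNING = "Warning"
    ALERT = "Alert"


# Hypothetical thresholds, chosen only for illustration.
WARNING_AFTER_MISSED = 3    # "a few" missed heartbeats
STANDBY_AFTER_MISSED = 10   # "many" missed heartbeats


def node_status_from_heartbeats(missed: int) -> Tuple[State, Flag]:
    """Map a count of consecutive missed heartbeats to a (State, Flag) pair."""
    if missed < 0:
        raise ValueError("missed heartbeat count cannot be negative")
    if missed >= STANDBY_AFTER_MISSED:
        # Node is presumed dead; Standby is generally coupled with Alert.
        return State.STANDBY, Flag.ALERT
    if missed >= WARNING_AFTER_MISSED:
        # Non-critical error; Warning is generally coupled with Ready.
        return State.READY, Flag.WARNING
    return State.READY, Flag.OK
```

With these assumed thresholds, a node that is heartbeating normally maps to Ready/OK, and the same node moves to Ready/Warning and then Standby/Alert as the missed count grows.
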
The following table describes how to interpret changes in hardware state (HBTD is the Heartbeat Tracker Daemon):
| Prior State | New State | Reason |
|---|---|---|
| Ready | Standby | HBTD, if the node has many missed heartbeats |
| Ready | Ready/Warning | HBTD, if the node has a few missed heartbeats |
| Standby | Ready | HBTD, when the node restarts heartbeating |
| On | Ready | HBTD, when the node starts heartbeating |
| Off | Ready | HBTD, if it sees heartbeats before the Redfish Event (On) |
| Standby | On | Redfish Event (On), or rediscovery while in the Standby state |
| Off | On | Redfish Event (On) |
| Standby | Off | Redfish Event (Off) |
| Ready | Off | Redfish Event (Off) |
| On | Off | Redfish Event (Off) |
| Any State | Empty | Redfish endpoint is disabled, meaning the component was removed |
Generally, nodes transition from Off to On to Ready when going from Off to booted, and from Ready to Ready/Warning to Standby to Off when shut down.
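
When reviewing a sequence of observed states, the table above can be treated as a lookup. The following Python sketch transcribes the table into a dictionary; the names `TRANSITIONS` and `explain_transition` are hypothetical helpers for illustration and are not part of HSM.

```python
from typing import Optional

# (prior state, new state) -> reason, paraphrased from the table above.
TRANSITIONS = {
    ("Ready", "Standby"): "HBTD: node has many missed heartbeats",
    ("Ready", "Ready/Warning"): "HBTD: node has a few missed heartbeats",
    ("Standby", "Ready"): "HBTD: node restarted heartbeating",
    ("On", "Ready"): "HBTD: node started heartbeating",
    ("Off", "Ready"): "HBTD: heartbeats seen before Redfish Event (On)",
    ("Standby", "On"): "Redfish Event (On) or rediscovery while in Standby",
    ("Off", "On"): "Redfish Event (On)",
    ("Standby", "Off"): "Redfish Event (Off)",
    ("Ready", "Off"): "Redfish Event (Off)",
    ("On", "Off"): "Redfish Event (Off)",
}

EMPTY_REASON = "Redfish endpoint disabled, meaning component removal"


def explain_transition(prior: str, new: str) -> Optional[str]:
    """Return the documented reason for a state change, or None if not listed."""
    if new == "Empty":
        # The table lists "Any State" -> Empty as a single row.
        return EMPTY_REASON
    # Drop a flag suffix such as "/Warning" from the prior state only,
    # so that "Ready/Warning" -> "Standby" matches the "Ready" row.
    prior_state = prior.split("/")[0]
    return TRANSITIONS.get((prior_state, new))


if __name__ == "__main__":
    # Typical boot and shutdown paths described in the text.
    for path in (["Off", "On", "Ready"],
                 ["Ready", "Ready/Warning", "Standby", "Off"]):
        for prior, new in zip(path, path[1:]):
            print(f"{prior} -> {new}: {explain_transition(prior, new)}")
```

A return value of `None` means the transition is not in the table, which may be worth investigating, but the table is not necessarily exhaustive.
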