If FAS has not yet been installed, firmware for NCNs can be updated manually without FAS. See Updating Firmware without FAS.
The Firmware Action Service (FAS) provides an interface for managing firmware versions of Redfish-enabled hardware in the system. FAS interacts with the Hardware State Manager (HSM), device data, and image data in order to update firmware.
Gigabyte nodes that are having issues with `ipmitool`, Redfish, or flashing firmware may need to be reset to factory defaults. See Set Gigabyte Node BMC to Factory Defaults.
FAS images contain the following information that is needed for a hardware device to update firmware versions:
- The `firmwareVersion` it will report after it is successfully applied. See Artifact Management for more information.

NEW: The `FASUpdate.py` script can be used to perform default updates to firmware and BIOS.
Non-compute nodes (NCNs) and their BMCs should be locked with the HSM locking API to ensure they are not unintentionally updated by FAS. See Lock and Unlock Management Nodes for more information. Failure to lock the NCNs could result in unintentional update of the NCNs if FAS is not used correctly; this will lead to system instability problems.
NOTE: Any node that is locked remains in the state `inProgress` with the `stateHelper` message of `"failed to lock"` until the action times out, or the lock is released. If the action is timed out, these nodes report as `failed` with the `stateHelper` message of `"time expired; could not complete update"`. This includes NCNs which are manually locked to prevent accidental rebooting and firmware updates.
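
When reviewing operation records, the two lock-related outcomes in the note above can be told apart by their `state` and `stateHelper` values. The following is a minimal sketch, assuming each operation is available as a dictionary with `state` and `stateHelper` keys; the function name and the returned labels are hypothetical and are not part of FAS.

```python
def describe_lock_outcome(operation: dict) -> str:
    """Classify an operation using the lock-related messages from the note.

    The dictionary shape and the returned labels are assumptions made
    for illustration only.
    """
    state = operation.get("state")
    helper = operation.get("stateHelper", "")
    if state == "inProgress" and "failed to lock" in helper:
        # The node is locked; FAS keeps waiting until timeout or unlock.
        return "waiting-on-lock"
    if state == "failed" and "time expired" in helper:
        # The action timed out before the update could complete.
        return "timed-out"
    return "other"


print(describe_lock_outcome({"state": "inProgress", "stateHelper": "failed to lock"}))
```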
Follow the process outlined in FAS CLI to update the system. Use the recipes listed in FAS Use Cases to update each supported type.
NOTE: Each system is different and may not have all hardware options.
The following table describes the hardware items that can have their firmware updated via FAS. For more information about the upgradable targets, refer to the Firmware product stream repository.
Table 1. Upgradable Firmware Items
Manufacturer | Type | Target |
---|---|---|
Cray | `nodeBMC` | `BMC`, `Node[0-3].BIOS`, `Recovery`, `Node[0-1].AccFPGA0`, `Node0.AccVBIOS`, `Node[0-3].ManagementEthernet`, `Node0.AccUC` |
Cray | `chassisBMC` | `BMC`, `Recovery` |
Cray | `routerBMC` | `BMC`, `Recovery` |
Gigabyte | `nodeBMC` | `BMC`, `BIOS` |
HPE | `nodeBMC` | `iLO 5`, `iLO 6`, `System ROM`, `Redundant System ROM` |
Foxconn | `nodeBMC` | `bmc_active`, `bios_active`, `erot_active`, `fpga_active`, `pld_active` |
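
Several targets in the table use a bracketed range such as `Node[0-3].BIOS`. The following is a minimal sketch, assuming the brackets denote an inclusive numeric range; that reading is an interpretation of the table, and the helper `expand_target` is hypothetical rather than something FAS provides.

```python
import re


def expand_target(pattern: str) -> list[str]:
    """Expand one bracketed range, e.g. 'Node[0-1].AccFPGA0', into names.

    Assumes at most one '[low-high]' group per pattern, read as an
    inclusive range. Patterns without brackets are returned unchanged.
    """
    match = re.search(r"\[(\d+)-(\d+)\]", pattern)
    if match is None:
        return [pattern]
    low, high = int(match.group(1)), int(match.group(2))
    prefix, suffix = pattern[:match.start()], pattern[match.end():]
    return [f"{prefix}{index}{suffix}" for index in range(low, high + 1)]


for name in expand_target("Node[0-3].BIOS"):
    print(name)
```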
For each item in the Hardware Precedence Order:

1. Complete a dry-run:
   1. Create the action:
      `cray fas actions create {jsonfile}`
      NEW: The `FASUpdate.py` script can be used to perform default updates to firmware and BIOS.
   2. Note the `ActionID`.
   3. Check the action until its `state` is `completed`:
      `cray fas actions describe {actionID} --format json`
2. Interpret the outcome of the dry-run. Look at the counts and determine if the dry-run identified any hardware to update.
   For the steps below, the following returned messages will help determine if a firmware update is needed. The following are end states for operations. The firmware action itself should be in `completed` once all operations have finished.
   - `NoOp`: Nothing to do; already at the requested version.
   - `NoSol`: No viable image is available; this will not be updated.
   - `succeeded`:
     - `dryrun`: The operation should succeed if performed as a live update. `succeeded` means that FAS identified that it COULD update a component name (xname) and target with the declared strategy.
     - live update: The operation succeeded and has updated the component name (xname) and target to the identified version.
   - `failed`:
     - `dryrun`: There is something that FAS could do, but it likely would fail (most likely because the file is missing).
     - live update: The operation failed. The identified version could not be put on the component name (xname) and target.
3. If the `succeeded` count is greater than zero, then perform a live update:
   1. Set the `overrideDryrun` field in the JSON file to `true`.
   2. Create the action:
      `cray fas actions create {jsonfile}`
      NEW: The `FASUpdate.py` script can be used to perform default updates to firmware and BIOS.
   3. Note the `ActionID`.
   4. Check the action until its `state` is `completed`:
      `cray fas actions describe {actionID} --format json`
4. Interpret the outcome of the live update; proceed to the next type of hardware.
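
The dry-run and live-update loop above can be scripted. The following is a minimal sketch, not a supported tool: it shells out to the `cray fas actions describe` command shown in the procedure, and it assumes that the returned JSON carries a top-level `state` field and a per-state count map named `operationCounts`, and that `overrideDryrun` sits under a `command` object in the JSON file. Those key names and all function names here are assumptions; check them against the actual output on the system.

```python
import copy
import json
import subprocess
import time
from typing import Callable


def describe_action(action_id: str) -> dict:
    """Run the describe command from the procedure and parse its JSON."""
    result = subprocess.run(
        ["cray", "fas", "actions", "describe", action_id, "--format", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def wait_until_completed(action_id: str,
                         describe: Callable[[str], dict] = describe_action,
                         poll_seconds: float = 30.0,
                         max_polls: int = 120) -> dict:
    """Poll until the action state is 'completed' and return the record."""
    for _ in range(max_polls):
        action = describe(action_id)
        state = action.get("state")  # assumed key name
        if state == "completed":
            return action
        if state == "aborted":
            raise RuntimeError(f"action {action_id} was aborted")
        time.sleep(poll_seconds)
    raise TimeoutError(f"action {action_id} did not complete in time")


def should_run_live_update(action: dict) -> bool:
    """A live update is worthwhile only if the succeeded count is above zero."""
    counts = action.get("operationCounts", {})  # assumed key name
    return counts.get("succeeded", 0) > 0


def enable_live_update(params: dict) -> dict:
    """Return a copy of the action parameters with overrideDryrun set to true."""
    updated = copy.deepcopy(params)
    updated.setdefault("command", {})["overrideDryrun"] = True  # assumed location
    return updated
```

Passing `describe` as a parameter keeps the polling logic testable without the `cray` CLI. The polling interval and poll limit are arbitrary defaults chosen for the sketch.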
After identifying which hardware is in the system, start with the top item on this list to update. If any of the following hardware is not in the system, then skip it.
IMPORTANT: This process does not communicate the SAFE way to update NCNs. If the NCNs and their BMCs have not been locked, or if FAS is blindly used to update NCNs without following the correct process, then THE STABILITY OF THE SYSTEM WILL BE JEOPARDIZED. Read the corresponding recipes before updating. There are sometimes ancillary actions that must be completed in order to ensure update integrity.
NOTE: To update Switch Controllers (sC) or `RouterBMC`, refer to the Rosetta Documentation.
There are several use cases for using the FAS to update firmware on the system. These use cases are intended to be run by system administrators with a good understanding of firmware. Under no circumstances should non-administrator users attempt to use FAS or perform a firmware update.
`imageID` to the record.
`imageIDs` to put the component name (xname)/targets back to the firmware version they were at, at the time of the snapshot.

An action is a collection of operations, which are individual firmware update tasks. Only one FAS action can be run at a time. Any other attempted action will be queued. Additionally, only one operation can be run on a component name (xname) at a time. For example, if there are 1000 xnames with 5 targets each to be updated, all 1000 xnames can be updating a target, but only 1 target on each xname will be updated at a time.
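
The one-operation-per-xname rule can be illustrated with a small model. The following is a sketch of the rule only, assuming operations are represented as `(xname, target)` pairs in queue order; it does not reflect how FAS schedules work internally, and the function name and example xnames are hypothetical.

```python
def split_runnable(operations: list[tuple[str, str]]):
    """Split queued operations into those that may run now and those
    that must wait because another operation holds the same xname."""
    busy_xnames = set()
    runnable, blocked = [], []
    for xname, target in operations:
        if xname in busy_xnames:
            # Another target on this xname is already selected.
            blocked.append((xname, target))
        else:
            busy_xnames.add(xname)
            runnable.append((xname, target))
    return runnable, blocked


queue = [
    ("x3000c0s1b0", "BMC"),
    ("x3000c0s1b0", "BIOS"),
    ("x3000c0s2b0", "BMC"),
]
runnable, blocked = split_runnable(queue)
print("runnable:", runnable)
print("blocked:", blocked)
```

With 1000 xnames and 5 targets each, this model selects 1000 operations to run and holds the other 4000 back, matching the example in the text.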
The life cycle of any action can be divided into the static and dynamic portions of the life cycle.
The static portion of the life cycle is where the action is created and configured. It begins with a request to create an action through either of the following requests:
- The `/actions` API.
- The `/snapshots` API.

The dynamic portion of the life cycle is where the action is executed to completion. It begins when the action is transitioned from the `new` to the `configured` state. The action will then ultimately be transitioned to an end state of `aborted` or `completed`.
Operations are individual tasks in a FAS action.
FAS will create operations based on the configuration sent through the `actions create` command.
FAS operations will have one of the following states:
- `initial` - Operation just created.
- `configured` - The operation is configured, but nothing has been started.
- `blocked` - Only one operation can be performed on a node at a time. If more than one update is required for a component name (xname), then operations will be blocked. This will have a message of `blocked by sibling`.
- `inProgress` - Update is in progress, but not completed.
- `verifying` - Waiting for update to complete.
- `failed` - An update was attempted, but FAS is unable to tell that the update succeeded in the allotted time.
- `noOperation` - Firmware is at the correct version according to the images loaded into FAS.
- `noSolution` - FAS does not have a suitable image for an update.
- `aborted` - The operation was aborted before it could determine if it was successful. If aborted after the update command was sent to the node, then the node may still have updated.

FAS requires images in order to update firmware for any device on the system. An image contains the data that allows FAS to establish a link between an administrative command, available devices (xname/targets), and available firmware.
The following is an example of an image:

    {
        "imageID": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
        "createTime": "2020-05-11T17:11:07.017Z",
        "deviceType": "nodeBMC",
        "manufacturer": "intel",
        "model": ["s2600", "s2600_REV_a"],
        "target": "BIOS",
        "tag": ["recovery", "default"],
        "firmwareVersion": "f1.123.24xz",
        "semanticFirmwareVersion": "v1.2.252",
        "updateURI": "/redfish/v1/Systems/UpdateService/BIOS",
        "needManualReboot": true,
        "waitTimeBeforeManualRebootSeconds": 600,
        "waitTimeAfterRebootSeconds": 180,
        "pollingSpeedSeconds": 30,
        "forceResetType": "ForceRestart",
        "s3URL": "s3://firmware/f1.1123.24.xz.iso",
        "allowableDeviceStates": [
            "On",
            "Off"
        ]
    }

The main components of an image are described in the following sections.
This includes the `deviceType`, `manufacturer`, `model`, `target`, `tag`, and `semanticFirmwareVersion` (firmware version) fields.
These fields are how administrators assess what firmware is on a device, and if an image is applicable to that device.
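
To make the idea of applicability concrete, the following is a minimal sketch of matching an image record against a device. It assumes exact matching on `deviceType`, `manufacturer`, and `target`, with the device model required to appear in the image's `model` list. The actual matching rules FAS applies are not described here, and the `device` dictionary shape and the function name are hypothetical.

```python
def image_applies(image: dict, device: dict) -> bool:
    """Return True if the image's selection criteria match the device.

    Assumed rule: deviceType, manufacturer, and target must be equal,
    and the device model must be one of the models the image lists.
    """
    return (
        image["deviceType"] == device["deviceType"]
        and image["manufacturer"] == device["manufacturer"]
        and device["model"] in image["model"]
        and image["target"] == device["target"]
    )


# Selection fields taken from the example image record.
image = {
    "deviceType": "nodeBMC",
    "manufacturer": "intel",
    "model": ["s2600", "s2600_REV_a"],
    "target": "BIOS",
}
device = {
    "deviceType": "nodeBMC",
    "manufacturer": "intel",
    "model": "s2600",
    "target": "BIOS",
}
print(image_applies(image, device))
```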
This includes the `forceResetType`, `pollingSpeedSeconds`, `waitTime(s)`, and `allowableDeviceStates` fields.
FAS gets information about how to update the firmware from these fields. These values determine if FAS is responsible for rebooting the device, and what communication pattern to use.
s3URL
The URL that FAS uses to get the firmware binary and the download link that is supplied to Redfish devices. Redfish devices are not able to directly communicate with S3.
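
An `s3URL` value such as the one in the example image splits into a bucket and an object key. The following is a small sketch using Python's standard `urllib.parse`; the helper name is hypothetical, and how FAS itself parses the field is not shown here.

```python
from urllib.parse import urlparse


def split_s3_url(s3_url: str) -> tuple[str, str]:
    """Split 's3://bucket/key' into (bucket, key)."""
    parsed = urlparse(s3_url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URL: {s3_url!r}")
    # The bucket is the network location; the key is the path
    # without its leading slash.
    return parsed.netloc, parsed.path.lstrip("/")


bucket, key = split_s3_url("s3://firmware/f1.1123.24.xz.iso")
print(bucket, key)
```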