The Cray command line interface (CLI) tool is initialized and configured on the system. See “Configure the Cray Command Line Interface (CLI)” in the HPE Cray EX System Administration Guide S-8001 for more information.
IMPORTANT
It may be necessary to explicitly set the power state of certain devices before performing a given firmware update. Refer to ()[] in order to understand the process for setting power state and controlling relevant settings.
Use the Firmware Action Service (FAS) to execute an action. An action produces a set of firmware operations. Each operation represents an xname + target on that xname that will be targeted for update. There are two of firmware action modes: : dryrun
or liveupdate
; the parameters used when creating either are completely identical except for the overrideDryrun
setting. overrideDryrun
will determine if feature to determine what firmware can be updated on the system. Dry-runs are enabled by default, and can be configured with the overrideDryrun
parameter. A dry-run will create a query according to the filters requested by the admin. It will initiate an update sequence to determine what firmware is available, but will not actually change the state of the firmware
WARNING: It is crucial that an admin is familiar with the release notes of any firmware. The release notes will indicate what new features the firmware provides and if there are any incompatibilities. FAS does not know about incompatibilities or dependencies between versions. The admin assumes full responsibility for this knowledge.
WARNING: It is likely that when performing a firmware update, that the current version of firmware will not be available. This means that after successfully upgrading, the firmware cannot be reverted (i.e. downgraded to previous version).
This will cover the generic process for executing an action. For more specific examples and detailed explanations of options see the recipes.md
file.
Identify the selection of filters you want to apply. Filters narrow the scope of FAS to target-specific xnames, manufacturers, targets, etc. For our purpose we will run FAS ‘wide open’, with no selection filters applied.
create a JSON file {whole-system-dryrun.json}; to make this a live update
set "overrideDryrun": true
{ "command": {
"version": "latest",
"tag": "default",
"overrideDryrun": false,
"restoreNotPossibleOverride": true,
"timeLimit": 1000,
"description": "dryrun of full system" }
}
# cray fas actions create {whole-system-dryrun.json}
...
{
"actionID": "e0cdd7c2-32b1-4a25-9b2a-8e74217eafa7",
"overrideDryrun": false
}
Note the returned actionID
See ‘interpreting an Action’ for more information.
Firmware updates can be stopped if required. This is useful given only one action can be run at a time. This is to protect hardware from multiple actions trying to modify it at the same time.
IMPORTANT: If a Redfish update is already in progress, the abort will not stop that process on the device. It is likely the device will update. If the device needs to be manually power cycled (needManualReboot
), it is possible that the device will update, but not actually apply the update until its next reboot. Admins must verify the state of the system after an abort. Only perform an abort if truly necessary. The best way to check the state of the system is to do a snapshot or do a dry-run of an update.
# cray fas actions instance delete {actionID}
Note: The action could take up to a minute to fully abort.
There are several ways to get more information about a firmware update. An actionID
and operationID
s are generated when an live update or dry-run is created. These values can be used to learn more about what is happening on the system during an update.
For the steps below, the following returned messages will help determine if a firmware update is needed. The following are end state
s for operations
. The Firmware action
itself should be in completed
once all operations have finished.
NoOp
: Nothing to do, already at version.NoSol
: No viable image is available; this will not be updated.succeeded
:
dryrun
: The operation should succeed if performed as a live update
. succeeded
means that FAS identified that it COULD update an xname + target with the declared strategy.live update
: the operation succeeded, and has updated the xname + target to the identified version.failed
:
dryrun
: There is something that FAS could do, but it likely would fail; most likely because the file is missing.live update
: the operation failed, the identified version could not be put on the xname + target.Data can be viewed at several levels of information:
To view counts of operations, what state they are in, the overall state of the action, and what parameters were used to create the action:
# cray fas actions describe {actionID}
blockedBy = []
state = "completed"
actionID = "0a305f36-6d89-4cf8-b4a1-b9f199afaf3b" startTime = "2020-06-23 15:43:42.939100799 +0000 UTC"
snapshotID = "00000000-0000-0000-0000-000000000000"
endTime = "2020-06-23 15:48:59.586748151 +0000 UTC"
[actions.command]
description = "upgrade of x9000c1s3b1 Nodex.BIOS to WNC 1.1.2" tag = "default"
restoreNotPossibleOverride = true timeLimit = 1000
version = "latest" overrideDryrun = false [actions.operationCounts] noOperation = 0
succeeded = 2
verifying = 0
unknown = 0
configured = 0
initial = 0
failed = 0
noSolution = 0
aborted = 0
needsVerified = 0
total = 2
inProgress = 0
blocked = 0 [[actions]] blockedBy = [] state = "completed"
actionID = "0b9300d6-8f06-4019-a8fa-7b3ff65e5aa8" startTime = "2020-06-18 03:06:25.694573366 +0000 UTC"
snapshotID = "00000000-0000-0000-0000-000000000000"
endTime = "2020-06-18 03:11:06.806297546 +0000 UTC"
NOTE : unless the action’s state
is completed
or aborted
; then this action is still under progress.
# cray fas actions describe {actionID} --format json
{
"parameters": {
"stateComponentFilter": {
"deviceTypes": [
"routerBMC"
]
},
"command": {
"dryrun": false,
"description": "upgrade of routerBMCs for cray",
"tag": "default",
"restoreNotPossibleOverride": true,
"timeLimit": 1000,
"version": "latest"
},
"inventoryHardwareFilter": {
"manufacturer": "cray"
},
"imageFilter": {
"imageID": "00000000-0000-0000-0000-000000000000"
},
"targetFilter": {
"targets": [
"BMC"
]
}
},
"blockedBy": [],
"state": "completed",
"command": {
"dryrun": false,
"description": "upgrade of routerBMCs for cray",
"tag": "default",
"restoreNotPossibleOverride": true,
"timeLimit": 1000,
"version": "latest"
},
"actionID": "e0cdd7c2-32b1-4a25-9b2a-8e74217eafa7",
"startTime": "2020-06-26 20:03:37.316932354 +0000 UTC",
"snapshotID": "00000000-0000-0000-0000-000000000000",
"endTime": "2020-06-26 20:04:07.118243184 +0000 UTC",
"operationSummary": {
"succeeded": {
"OperationsKeys": []
},
"verifying": {
"OperationsKeys": []
},
"unknown": {
"OperationsKeys": []
},
"configured": {
"OperationsKeys": []
},
"initial": {
"OperationsKeys": []
},
"failed": {
"OperationsKeys": [
{
"stateHelper": "unexpected change detected in firmware version. Expected sc.1.4.35-prod- master.arm64.2020-06-26T08:36:42+00:00.0c2bb02 got: sc.1.3.307-prod-master.arm64.2020-06-13T00:28:26+00:00.f91edff",
"fromFirmwareVersion": "",
"xname": "x5000c1r7b0",
"target": "BMC",
"operationID": "0796eed0-e95d-45ea-bc71-8903d52cffde"
},
]
},
"noSolution": {
"OperationsKeys": []
},
"aborted": {
"OperationsKeys": []
},
"needsVerified": {
"OperationsKeys": []
},
"noOperation": {
"OperationsKeys": []
},
"inProgress": {
"OperationsKeys": []
},
"blocked": {
"OperationsKeys": []
}
}
}
Using the operationID
listed in the actions array we can see the full detail of the operation.
# cray fas operations describe operationID --format json
{
"fromFirmwareVersion": "", "fromTag": "",
"fromImageURL": "",
"endTime": "2020-06-24 14:23:37.544814197 +0000 UTC",
"actionID": "f48aabf1-1616-49ae-9761-a11edb38684d", "startTime": "2020-06-24 14:19:15.10128214 +0000 UTC",
"fromSemanticFirmwareVersion": "", "toImageURL": "",
"model": "WindomNodeCard_REV_D",
"operationID": "24a5e5fb-5c4f-4848-bf4e-b071719c1850", "fromImageID": "00000000-0000-0000-0000-000000000000",
"target": "BMC",
"toImageID": "71c41a74-ab84-45b2-95bd-677f763af168", "toSemanticFirmwareVersion": "",
"refreshTime": "2020-06-24 14:23:37.544824938 +0000 UTC",
"blockedBy": [],
"toTag": "",
"state": "succeeded",
"stateHelper": "unexpected change detected in firmware version. Expected nc.1.3.8-shasta-release.arm.2020-06-15T22:57:31+00:00.b7f0725 got: nc.1.2.25-shasta-release.arm.2020-05-15T17:27:16+00:00.0cf7f51",
"deviceType": "",
"expirationTime": "",
"manufacturer": "cray",
"xname": "x9000c1s3b1",
"toFirmwareVersion": ""
}
If an update fails due to "No Image available"
, it may be caused by FAS unable to match the data on the node to find an image in the image list.
Find the available image in FAS using the command: (change TARGETNAME to the actual target you are looking for)
cray fas images list --format json | jq '.[] | .[] | select(.target=="TARGETNAME")'
This command would display one or more images available for updates.
{
"imageID": "ff268e8a-8f73-414f-a9c7-737a34bb02fc",
"createTime": "2021-02-24T02:25:03Z",
"deviceType": "nodeBMC",
"manufacturer": "cray",
"models": [
"HPE Cray EX235n",
"GrizzlyPkNodeCard_REV_B"
],
"softwareIds": [
"fgpa:NVIDIA.HGX.A100.4.GPU:*:*"
],
"target": "Node0.AccFPGA0",
"tags": [
"default"
],
"firmwareVersion": "2.7",
"semanticFirmwareVersion": "2.7.0",
"pollingSpeedSeconds": 30,
"s3URL": "s3:/fw-update/80a62641764711ebabe28e2b78a05899/accfpga_nvidia_2.7.tar.gz"
}
If the firmwareVersion
from the FAS image matches the fromFirmwareVersion
from the FAS action, the firmware is at the latest version and no update is needed.
Using the imageID from the cray images list
command above (in the example above it would be: ff268e8a-8f73-414f-a9c7-737a34bb02fc
) add the following line to your action JSON file, replacing IMAGEID with the imageID:
"imageFilter": {
"imageID":"IMAGEID",
"overrideImage":true
}
Example actions JSON file with imageFilter added:
{
"stateComponentFilter": {
"deviceTypes":["nodeBMC"]
},
"inventoryHardwareFilter": {
"manufacturer":"cray"
},
"imageFilter": {
"imageID":"ff268e8a-8f73-414f-a9c7-737a34bb02fc",
"overrideImage":true
},
"targetFilter": {
"targets":["Node0.AccFPGA0","Node1.AccFPGA0"]
},
"command": {
"overrideDryrun":false,
"restoreNotPossibleOverride":true,
"overwriteSameImage":false
}
}
To be sure you grabbed the correct imageID, you can run the command:
cray fas images describe imageID
WARNING: FAS will force a flash of the device, using incorrect firmware may make it inoperable.
Rerun FAS actions command using the updated JSON file. It is strongly recommended you run a Dry Run (overrideDryrun=false) first and check the actions output.