If a Redfish endpoint is in the HTTPsGetFailed status, then the endpoint does not need to be fully rediscovered. The error indicates
an issue in the inventory process done by the Hardware State Manager (HSM). Restart the inventory process to fix this issue.
Update the HSM inventory to resolve issues with discovering Redfish endpoints.
The following is an example of the HSM error:
cray hsm inventory redfishEndpoints describe x3000c0s15b0
Example output:
Domain = ""
MACAddr = "b42e993b70ac"
Enabled = true
Hostname = "10.254.2.100"
RediscoverOnUpdate = true
FQDN = "10.254.2.100"
User = "root"
Password = ""
Type = "NodeBMC"
ID = "x3000c0s15b0"
[DiscoveryInfo]
LastDiscoveryAttempt = "2019-11-18T21:34:29.990441Z"
RedfishVersion = "1.1.0"
LastDiscoveryStatus = "HTTPsGetFailed"
Restart the HSM inventory process.
cray hsm inventory discover create --xnames XNAME
Verify that the Redfish endpoint has been rediscovered by HSM.
cray hsm inventory redfishEndpoints describe XNAME
Example output:
Domain = ""
MACAddr = "b42e993b70ac"
Enabled = true
Hostname = "10.254.2.100"
RediscoverOnUpdate = true
FQDN = "10.254.2.100"
User = "root"
Password = ""
Type = "NodeBMC"
ID = "x3000c0s15b0"
[DiscoveryInfo]
LastDiscoveryAttempt = "2019-11-18T21:34:29.990441Z"
RedfishVersion = "1.1.0"
LastDiscoveryStatus = "DiscoverOK"
Troubleshooting can stop if the discovery status is DiscoverOK. Otherwise, proceed to the next step.
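The status check above can also be scripted. The following is a minimal sketch, assuming the RedfishEndpoint record is available as JSON with the same key names as the example output (for instance, if the CLI is asked for JSON output). The function name `discovery_summary` and the sample record are illustrative only and are not part of HSM or the cray CLI.

```python
import json


def discovery_summary(endpoint: dict) -> dict:
    """Summarize the fields this procedure looks at in a RedfishEndpoint record."""
    info = endpoint.get("DiscoveryInfo", {})
    status = info.get("LastDiscoveryStatus", "")
    enabled = bool(endpoint.get("Enabled", False))
    rediscover = bool(endpoint.get("RediscoverOnUpdate", False))
    return {
        "status": status,
        # Troubleshooting can stop once HSM reports DiscoverOK.
        "can_stop": status == "DiscoverOK",
        # The re-enable step sets both of these fields to true.
        "needs_reenable": not (enabled and rediscover),
    }


# Sample record (assumed JSON shape) mirroring the example output above.
record = json.loads("""
{
  "ID": "x3000c0s15b0",
  "Type": "NodeBMC",
  "Enabled": true,
  "RediscoverOnUpdate": true,
  "DiscoveryInfo": {
    "LastDiscoveryAttempt": "2019-11-18T21:34:29.990441Z",
    "RedfishVersion": "1.1.0",
    "LastDiscoveryStatus": "HTTPsGetFailed"
  }
}
""")

print(discovery_summary(record))
```

For the sample record, `can_stop` is false because the status is still HTTPsGetFailed, so the procedure continues with the next step.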
Re-enable the RedfishEndpoint by setting both the Enabled and RediscoverOnUpdate fields to true.
If Enabled = false for the RedfishEndpoint, then it may indicate a network or firmware issue with the BMC.
cray hsm inventory redfishEndpoints update BMC_XNAME \
--id BMC_XNAME --hostname BMC_XNAME \
--enabled true --rediscover-on-update true
Verify that the Redfish endpoint has been rediscovered by HSM.
Re-enabling the RedfishEndpoint will cause a rediscovery to start. Troubleshooting can stop if the discovery status is DiscoverOK.
cray hsm inventory redfishEndpoints describe BMC_XNAME
Troubleshooting: If discovery is still failing, then use the curl command to manually contact the BMC using Redfish.
curl -k -u USERNAME:PASSWORD https://BMC_XNAME/redfish/v1/
If there is no response from /redfish/v1/, then the BMC is either not powered or there is a network issue.
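If curl is not convenient, the same probe can be approximated with the Python standard library. This is a rough sketch, not a replacement for the curl command: it assumes HTTP basic authentication and skips certificate verification in the same way that `curl -k` does. The helper names `service_root_url` and `probe_service_root` are hypothetical.

```python
import base64
import ssl
import urllib.error
import urllib.request


def service_root_url(bmc_host: str) -> str:
    """Build the Redfish service root URL used in the curl example."""
    return f"https://{bmc_host}/redfish/v1/"


def probe_service_root(bmc_host: str, username: str, password: str,
                       timeout: float = 10.0) -> str:
    """Return 'HTTP <code>' if the BMC answered, or 'no response: ...' otherwise."""
    # Equivalent of curl -k: do not verify the BMC certificate.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE

    # Equivalent of curl -u: send HTTP basic authentication credentials.
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(
        service_root_url(bmc_host),
        headers={"Authorization": f"Basic {token}"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout, context=ctx) as resp:
            return f"HTTP {resp.status}"
    except urllib.error.HTTPError as err:
        # The BMC answered, but with an error status (for example 401).
        return f"HTTP {err.code}"
    except (urllib.error.URLError, OSError) as err:
        # Nothing answered: the BMC may be unpowered or unreachable.
        return f"no response: {err}"
```

A call such as `probe_service_root("BMC_XNAME", "USERNAME", "PASSWORD")` returning a string that starts with `no response` corresponds to the case described above, where the BMC is either not powered or there is a network issue.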
Contact other Redfish URLs on the BMC.
If the response is one of the following, then there is a firmware issue and a BMC restart or update may be needed:
An empty response:
curl -ku USERNAME:PASSWORD https://BMC_XNAME/redfish/v1/Systems/Node0 | jq .
Example output:
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 1330 100 1330 0 0 5708 0 --:--:-- --:--:-- --:--:-- 5708
{}
A garbled response:
curl -ku USERNAME:PASSWORD https://BMC_XNAME/redfish/v1/Managers/Self
Example output:
<pre style="font-size:12px; font-family:monospace; color:#8B0000;">[web.lua] Error in RequestHandler, thread: 0xb60670d8 is dead.
▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼▼
./redfish-handler.lua:0: attempt to index a nil value
stack traceback:
./turbo/httpserver.lua:251: in function <./turbo/httpserver.lua:212>
[C]: in function 'xpcall'
./turbo/iostream.lua:553: in function <./turbo/iostream.lua:544>
[C]: in function 'xpcall'
./turbo/ioloop.lua:573: in function <./turbo/ioloop.lua:572>
▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲</pre>
The URLs listed for the Systems do not include /Systems/ in the URL:
curl -ku USERNAME:PASSWORD https://BMC_XNAME/redfish/v1/Systems | jq .
Example output:
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 421 100 421 0 0 1427 0 --:--:-- --:--:-- --:--:-- 1422
{
"@odata.context": "/redfish/v1/$metadata#ComputerSystemCollection.ComputerSystemCollection",
"@odata.etag": "W/\"1604517759\"",
"@odata.id": "/redfish/v1/Systems",
"@odata.type": "#ComputerSystemCollection.ComputerSystemCollection",
"Description": "Collection of Computer Systems",
"Members": [
{
"@odata.id": "/redfish/v1/Node0"
},
{
"@odata.id": "/redfish/v1/Node1"
}
],
"Members@odata.count": 2,
"Name": "Systems Collection"
}
The URL for a system is missing the SystemID:
curl -ku USERNAME:PASSWORD https://BMC_XNAME/redfish/v1/Systems | jq .
Example output:
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 421 100 421 0 0 1427 0 --:--:-- --:--:-- --:--:-- 1422
{
"@odata.context": "/redfish/v1/$metadata#ComputerSystemCollection.ComputerSystemCollection",
"@odata.etag": "W/\"1604517759\"",
"@odata.id": "/redfish/v1/Systems",
"@odata.type": "#ComputerSystemCollection.ComputerSystemCollection",
"Description": "Collection of Computer Systems",
"Members": [
{
"@odata.id": "/redfish/v1/Systems/"
}
],
"Members@odata.count": 1,
"Name": "Systems Collection"
}
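The four firmware symptoms listed above can be checked mechanically. The sketch below is one possible way to classify a response body saved from the curl commands. The function name `classify_redfish_body` and its return labels are illustrative, not part of any Cray or Redfish tooling. The member URL checks are only meaningful for the /redfish/v1/Systems collection.

```python
import json


def classify_redfish_body(body: str) -> str:
    """Label a Redfish response body with the firmware symptom it shows, if any."""
    try:
        doc = json.loads(body)
    except json.JSONDecodeError:
        # Not JSON at all, for example an HTML error page with a Lua stack trace.
        return "garbled"
    if not isinstance(doc, dict) or not doc:
        # Valid JSON, but an empty object such as {}.
        return "empty"
    for member in doc.get("Members", []):
        url = member.get("@odata.id", "")
        if "/Systems/" not in url:
            # For example /redfish/v1/Node0 instead of /redfish/v1/Systems/Node0.
            return "missing-systems-path"
        if url.rstrip("/").endswith("/Systems"):
            # For example /redfish/v1/Systems/ with no SystemID after it.
            return "missing-system-id"
    return "ok"


print(classify_redfish_body("{}"))
print(classify_redfish_body('{"Members": [{"@odata.id": "/redfish/v1/Node0"}]}'))
print(classify_redfish_body('{"Members": [{"@odata.id": "/redfish/v1/Systems/"}]}'))
```

Any result other than `ok` matches one of the cases above, where a BMC restart or firmware update may be needed.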
Troubleshooting: If there was no indication of a firmware issue, then there may be an issue with Vault or the credentials stored in Vault.