Scroll down for code samples, example requests and responses. Select a language for code samples from the tabs above or the mobile navigation menu.
Many CAPMC v1 REST API and CLI features are being deprecated as part of CSM version 1.0; Full removal of the deprecated CAPMC features will happen in CSM version 1.3. Further development of CAPMC service or CLI has stopped. CAPMC has entered end-of-life but will still be generally available. CAPMC is going to be replaced with the Power Control Service (PCS) in a future release. The current API/CLI portfolio for CAPMC are being pruned to better align with the future direction of PCS. More information about PCS and the CAPMC transition will be released as part of subsequent CSM releases.
The API endpoints that remain un-deprecated will remain supported until their ‘phased transition’ into PCS (e.g. Power Capping is not ‘deprecated’ and will be supported in PCS; As PCS is developed, CAPMC’s Powercapping and PCS’s Powercapping will both function, eventually callers of the CAPMC power capping API/CLI will need to will need transition to call PCS as the API will be different.)
Here is a list of deprecated API (CLI) endpoints:
/get_node_rules
/get_node_status
/node_on
/node_off
/node_reinit
/node_status
/group_reinit
/get_group_status
/group_on
/group_off
/get_node_energy
/get_node_energy_stats
/get_node_energy_counter
/get_system_parameters
/get_system_power
/get_system_power_details
/emergency_power_off
/get_nid_map
The Cray Advanced Platform Monitoring and Control (CAPMC) API provides remote power monitoring and control to agents running externally to the Cray System Management Services.
These controls enable external software to manage the power state of an entire system and to more intelligently manage systemwide power consumption. The following API calls are provided as a means for third party software to implement power control and management strategies as simple or complex as the site-level requirements demand. The simplest power management strategy may be to simply turn off components which may be idle for a significant time interval and turn them back on when demand increases.
This implementation of CAPMC uses Redfish APIs to communicate directly with the hardware. Furthermore, all CAPMC power commands should be viewed as asynchronous and require the client to check for status after a CAPMC API call returns.
CAPMC component power controls implement a simple interface for powering off or on components (chassis, modules, nodes), and querying component state information using component identifiers known as xnames. These controls enable external software to more intelligently manage systemwide power consumption or configuration parameters.
CAPMC power capping controls implement a simple interface for querying component capabilities and manipulation of node or sub-node (accelerator) power constraints. This functionality enables external software to establish an upper bound, or estimate a minimum bound, on the amount of power a system or a select subset of the system may consume. The power capping API calls are provided as a means for third party software to implement advanced power management strategies.
Base URLs:
Code samples
POST https://api-gw-service-nmn.local/apis/capmc/capmc/v1/get_xname_status HTTP/1.1
Host: api-gw-service-nmn.local
Content-Type: application/json
Accept: application/json
# You can also use wget
curl -X POST https://api-gw-service-nmn.local/apis/capmc/capmc/v1/get_xname_status \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-H 'Authorization: Bearer {access-token}'
import requests
headers = {
'Content-Type': 'application/json',
'Accept': 'application/json',
'Authorization': 'Bearer {access-token}'
}
r = requests.post('https://api-gw-service-nmn.local/apis/capmc/capmc/v1/get_xname_status', headers = headers)
print(r.json())
package main
import (
"bytes"
"net/http"
)
func main() {
headers := map[string][]string{
"Content-Type": []string{"application/json"},
"Accept": []string{"application/json"},
"Authorization": []string{"Bearer {access-token}"},
}
data := bytes.NewBuffer([]byte{jsonReq})
req, err := http.NewRequest("POST", "https://api-gw-service-nmn.local/apis/capmc/capmc/v1/get_xname_status", data)
req.Header = headers
client := &http.Client{}
resp, err := client.Do(req)
// ...
}
POST /get_xname_status
Return component status
The get_xname_status
API returns component state for the full system or a subset of components as specified by a xname list and/or state filter. This status API is intended, but not limited, to be used in conjunction with operations which may modify component state, such as xname_on
or xname_off
.
By default, the status returned from this API are the hardware states as reported by Redfish (on or off). An optional source
parameter may be passed in the request body to report the states defined by the Cray Hardware Management System (HMS) software.
The get_xname_status
API does not report empty components.
Filters
Filters for a status query may be supplied as a pipe-separated (|) list surrounded with double quotes, e.g. “filter1|filter2|filter3”. Valid filters are: show_all
, show_off
, show_on
, show_halt
, show_standby
, show_ready
, and show_disabled
. Valid flag filters are show_alert
, show_resvd
, and show_warn
. Status and flag filters may be intermixed freely. The show_all
filter overrides any other status or flag filters and is the default.
Notice
The Hardware Management System no longer supports a “diag” state, as such the show_diag
filter is considered deprecated. Specifying only the show_diag
filter will not return any results.
Body parameter
{
"filter": "show_ready|show_standby",
"source": "hsm",
"xnames": [
"x0c0s1b0n0",
"x0c1s4b0n0",
"x0c1s6b0n0",
"x0c1s7b0n0",
"x0c0s1b0n1",
"x0c1s4b0n1",
"x0c1s6b0n1",
"x0c1s7b0n1"
]
}
Name | In | Type | Required | Description |
---|---|---|---|---|
body | body | object | true | A JSON object to get status for selected components. |
» filter | body | string | false | Optional, pipe concatenated ( |
» source | body | any | false | A string indicating the source for node status. Valid sources are HSM (or its aliases HMS , SM , SMD , or Software ) and Redfish (or its alias Hardware ). The default, when unspecified, is to use Redfish via the appropriate controller as the source for all status. The Hardware Management System (HMS) returns the largest set of possible status. A Redfish hardware source can only report off and on status for components. Source strings are normalized to all lower case so HSM or hsm are both valid. |
» xnames | body | [string] | false | User specified list of component IDs (xnames) to get the status of. An empty array indicates all components in the system. On systems with more than around 200 nodes, requests that specify an empty array may not succeed; in those cases, explicitly specifying the desired xnames should work. If invalid xnames are specified then an error will be returned. |
Example responses
200 Response
{
"e": 0,
"err_msg": "",
"ready": [
"x0c0s1b1n0",
"x0c1s4b1n0",
"x0c1s6b1n0",
"x0c1s7b1n0"
],
"standby": [
"x0c0s1b1n1",
"x0c1s4b1n1",
"x0c1s6b1n1",
"x0c1s7b1n1"
]
}
Status | Meaning | Description | Schema |
---|---|---|---|
200 | OK | OK Network API call success | Inline |
400 | Bad Request | Bad Request | httpError400_BadRequest |
405 | Method Not Allowed | Method Not Allowed | httpError405_MethodNotAllowed |
500 | Internal Server Error | Internal Server Error | httpError500_InternalServerError |
Status Code 200
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
» e | integer(int32) | true | none | Request status code, zero on success, non-zero on error. |
» err_msg | string | true | none | Message indicating any error encountered. |
» on | [string] | false | none | Optional, list of powered on components by xname. |
» off | [string] | false | none | Optional, list of powered off components by xname. |
» disabled | [string] | false | none | Optional, list of disabled components by xname. The component is physically installed, but ignored by system management software. The Hardware Management System does not treat disabled as a separate state and as such disabled does not indicate the anything about the hardware or software state of a node. |
» ready | [string] | false | none | Optional, list of booted components by xname. Operating system is fully booted and sending heartbeats. |
» standby | [string] | false | none | Optional, list of components in standby by xname. Components that were previously booted and but are no longer sending heartbeat. |
To perform this operation, you must be authenticated by means of one of the following methods: bearerAuth
Code samples
POST https://api-gw-service-nmn.local/apis/capmc/capmc/v1/xname_reinit HTTP/1.1
Host: api-gw-service-nmn.local
Content-Type: application/json
Accept: application/json
# You can also use wget
curl -X POST https://api-gw-service-nmn.local/apis/capmc/capmc/v1/xname_reinit \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-H 'Authorization: Bearer {access-token}'
import requests
headers = {
'Content-Type': 'application/json',
'Accept': 'application/json',
'Authorization': 'Bearer {access-token}'
}
r = requests.post('https://api-gw-service-nmn.local/apis/capmc/capmc/v1/xname_reinit', headers = headers)
print(r.json())
package main
import (
"bytes"
"net/http"
)
func main() {
headers := map[string][]string{
"Content-Type": []string{"application/json"},
"Accept": []string{"application/json"},
"Authorization": []string{"Bearer {access-token}"},
}
data := bytes.NewBuffer([]byte{jsonReq})
req, err := http.NewRequest("POST", "https://api-gw-service-nmn.local/apis/capmc/capmc/v1/xname_reinit", data)
req.Header = headers
client := &http.Client{}
resp, err := client.Do(req)
// ...
}
POST /xname_reinit
Restart components
The xname_reinit
API issues a restart or off-on sequence to a selected list of components by xname. These power operations are ordered to allow large sets of components to be reinit’d with a single API call. Not all components support a restart, in these cases, the API may issue an off and then on to those components.
The xname_reinit
API will return after a power restart or off-on sequence is attempted to be sent to all of the selected components. The return payload should be examined as it may indicate components that did not receive the power request due to an error. The API may return immediately on an error containing a status result indicating any error encountered. The client must determine overall command status by calling the get_xname_status
API after this call returns.
xname_reinit
accepts all and s0 as valid xnames indicating all components in the system.
An optional text message may be provided describing the reason for performing the xname_reinit
operation.
Body parameter
{
"reason": "Changing kernel",
"xnames": [
"x0c0s1b0n0",
"x0c1s4b0n0",
"x0c1s6b0n0",
"x0c1rsb0n0"
],
"force": true
}
Name | In | Type | Required | Description |
---|---|---|---|---|
body | body | object | true | A JSON object to reinit selected components. |
» reason | body | string | false | Reason for doing a component reinit. |
» xnames | body | [string] | true | User specified list of component IDs (xnames) to reinit. An empty array is invalid. If invalid xnames are specified then an error will be returned. |
» force | body | boolean | false | Attempt to restart components disabling any checks for a graceful restart. |
Example responses
200 Response
{
"e": -1,
"err_msg": "",
"xnames": [
{
"e": -1,
"err_msg": "NodeBMC communication error",
"xname": "x0c0s1b0n0"
},
{
"e": -1,
"err_msg": "NodeBMC communication error",
"xname": "x0c1s6b0n0"
}
]
}
Status | Meaning | Description | Schema |
---|---|---|---|
200 | OK | OK Network API call success | Inline |
400 | Bad Request | Bad Request | httpError400_BadRequest |
405 | Method Not Allowed | Method Not Allowed | httpError405_MethodNotAllowed |
500 | Internal Server Error | Internal Server Error | httpError500_InternalServerError |
Status Code 200
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
» e | integer(int32) | true | none | Request status code, zero on success, non-zero on error. |
» err_msg | string | true | none | Message indicating any error encountered. |
» xnames | [object] | false | none | none |
»» e | integer(int32) | true | none | Non-zero status code for failed request. |
»» err_msg | string | true | none | Message indicating any error encountered. |
»» xname | string | true | none | Component ID failing power restart attempt. |
To perform this operation, you must be authenticated by means of one of the following methods: bearerAuth
Code samples
POST https://api-gw-service-nmn.local/apis/capmc/capmc/v1/xname_on HTTP/1.1
Host: api-gw-service-nmn.local
Content-Type: application/json
Accept: application/json
# You can also use wget
curl -X POST https://api-gw-service-nmn.local/apis/capmc/capmc/v1/xname_on \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-H 'Authorization: Bearer {access-token}'
import requests
headers = {
'Content-Type': 'application/json',
'Accept': 'application/json',
'Authorization': 'Bearer {access-token}'
}
r = requests.post('https://api-gw-service-nmn.local/apis/capmc/capmc/v1/xname_on', headers = headers)
print(r.json())
package main
import (
"bytes"
"net/http"
)
func main() {
headers := map[string][]string{
"Content-Type": []string{"application/json"},
"Accept": []string{"application/json"},
"Authorization": []string{"Bearer {access-token}"},
}
data := bytes.NewBuffer([]byte{jsonReq})
req, err := http.NewRequest("POST", "https://api-gw-service-nmn.local/apis/capmc/capmc/v1/xname_on", data)
req.Header = headers
client := &http.Client{}
resp, err := client.Do(req)
// ...
}
POST /xname_on
Power on components
The xname_on
API powers on a selected list of components by xname. Power on operations are ordered to allow large sets of components to be powered on with a single API call.
The xname_on
API will return after a power on request is attempted to be sent to all of the selected components. The return payload should be examined as it may indicate components that did not receive the power on request due to an error. The API may return immediately on an error containing a status result indicating the error encountered. The client must determine overall command status by calling the get_xname_status
API after this call returns.
xname_on
accepts all and s0 as valid xnames indicating all components in the system.
An optional text message may be provided describing the reason for performing the xname_on
operation.
Body parameter
{
"reason": "Power on nodes to expand capacity",
"xnames": [
"x0c0s1b0n0",
"x0c1s4b0n0",
"x0c1s6b0n0",
"x0c1rsb0n0"
],
"force": true,
"recursive": true
}
Name | In | Type | Required | Description |
---|---|---|---|---|
body | body | object | true | A JSON object to power on selected components. |
» reason | body | string | false | Reason for turning components on. |
» xnames | body | [string] | true | User specified list of component IDs (xnames) to power on. An empty array is invalid. If invalid xnames are specified then an error will be returned. Available wildcards: all, s0. |
» force | body | boolean | false | Attempt to power on components disabling any checks for a graceful power on. |
» recursive | body | boolean | false | Attempt to power on the component hierarchy rooted by each component ID (xname). Incompatible with the prereq option. |
» prereq | body | boolean | false | Attempt to power on the component IDs(xnames) and all of their ancestors. Incompatible with the recursive option. |
» continue | body | boolean | false | Continue powering on valid component IDs (xnames) ignoring any component ID validation errors. Normally, a failure in validation ceases any attempt to power on any components. |
Example responses
200 Response
{
"e": -1,
"err_msg": "",
"xnames": [
{
"e": -1,
"err_msg": "NodeBMC communication error",
"xname": "x0c0s1b0n0"
},
{
"e": -1,
"err_msg": "NodeBMC communication error",
"xname": "x0c1s6b0n0"
}
]
}
Status | Meaning | Description | Schema |
---|---|---|---|
200 | OK | OK Network API call success | Inline |
400 | Bad Request | Bad Request | httpError400_BadRequest |
405 | Method Not Allowed | Method Not Allowed | httpError405_MethodNotAllowed |
500 | Internal Server Error | Internal Server Error | httpError500_InternalServerError |
Status Code 200
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
» e | integer(int32) | true | none | Request status code, zero on success, non-zero on error. |
» err_msg | string | true | none | Message indicating any error encountered. |
» xnames | [object] | false | none | none |
»» e | integer(int32) | true | none | Non-zero status code for failed request. |
»» err_msg | string | true | none | Message indicating any error encountered. |
»» xname | string | true | none | Component ID failing power up attempt. |
To perform this operation, you must be authenticated by means of one of the following methods: bearerAuth
Code samples
POST https://api-gw-service-nmn.local/apis/capmc/capmc/v1/xname_off HTTP/1.1
Host: api-gw-service-nmn.local
Content-Type: application/json
Accept: application/json
# You can also use wget
curl -X POST https://api-gw-service-nmn.local/apis/capmc/capmc/v1/xname_off \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-H 'Authorization: Bearer {access-token}'
import requests
headers = {
'Content-Type': 'application/json',
'Accept': 'application/json',
'Authorization': 'Bearer {access-token}'
}
r = requests.post('https://api-gw-service-nmn.local/apis/capmc/capmc/v1/xname_off', headers = headers)
print(r.json())
package main
import (
"bytes"
"net/http"
)
func main() {
headers := map[string][]string{
"Content-Type": []string{"application/json"},
"Accept": []string{"application/json"},
"Authorization": []string{"Bearer {access-token}"},
}
data := bytes.NewBuffer([]byte{jsonReq})
req, err := http.NewRequest("POST", "https://api-gw-service-nmn.local/apis/capmc/capmc/v1/xname_off", data)
req.Header = headers
client := &http.Client{}
resp, err := client.Do(req)
// ...
}
POST /xname_off
Power off components
The xname_off
API will shutdown and power off a selected list of components by xname. Power off operations are ordered to allow large sets of components to be powered off with a single API call.
The xname_off
API will return after a power off request is attempted to be sent to all of the selected components. The return payload should be examined as it may indicate components that did not receive the power off request due to an error. The API may return immediately on an error containing a status result indicating the error encountered. The client must determine overall command status by calling the get_xname_status
API after this call returns.
xname_off
accepts as xnames all and s0 indicating all components in the system.
An optional text message may be provided describing the reason for performing the xname_off
operation.
Body parameter
{
"reason": "Power save, need less capacity",
"xnames": [
"x0c0s1b0n0",
"x0c1s4b0n0",
"x0c1s6b0n0",
"x0c1rsb0n0"
],
"force": true,
"recursive": true
}
Name | In | Type | Required | Description |
---|---|---|---|---|
body | body | object | true | A JSON object to power off selected components. |
» reason | body | string | false | Reason for turning components off. |
» xnames | body | [string] | true | User specified list of component IDs (xnames) to shutdown and power off. An empty array is invalid. If invalid xnames are specified then an error will be returned. Available wildcards: all, s0. |
» force | body | boolean | false | Attempt to power off components disabling any checks for a graceful power off. |
» recursive | body | boolean | false | Attempt to power off the component hierarchy rooted by each component ID (xname). Incompatible with the prereq option. |
» prereq | body | boolean | false | Attempt to power off the component and all of its ancestors by their component ID (xname). Incompatible with the recursive option. |
» continue | body | boolean | false | Continue powering on valid component IDs (xnames) ignoring any component ID validation errors. Normally, a failure in validation ceases any attempt to power on any components. |
Example responses
200 Response
{
"e": -1,
"err_msg": "",
"xnames": [
{
"e": -1,
"err_msg": "NodeBMC communication error",
"xname": "x0c0s1b0n0"
},
{
"e": -1,
"err_msg": "NodeBMC communication error",
"xname": "x0c1s6b0n0"
}
]
}
Status | Meaning | Description | Schema |
---|---|---|---|
200 | OK | OK Network API call success | Inline |
400 | Bad Request | Bad Request | httpError400_BadRequest |
405 | Method Not Allowed | Method Not Allowed | httpError405_MethodNotAllowed |
500 | Internal Server Error | Internal Server Error | httpError500_InternalServerError |
Status Code 200
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
» e | integer(int32) | true | none | Request status code, zero on success, non-zero on error. |
» err_msg | string | true | none | Message indicating any error encountered. |
» xnames | [object] | false | none | none |
»» xname | string | true | none | Component ID failing power down attempt. |
»» e | integer(int32) | true | none | Non-zero status code for failed request. |
»» err_msg | string | true | none | Message indicating any error encountered. |
To perform this operation, you must be authenticated by means of one of the following methods: bearerAuth
Code samples
POST https://api-gw-service-nmn.local/apis/capmc/capmc/v1/get_power_cap HTTP/1.1
Host: api-gw-service-nmn.local
Content-Type: application/json
Accept: application/json
# You can also use wget
curl -X POST https://api-gw-service-nmn.local/apis/capmc/capmc/v1/get_power_cap \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-H 'Authorization: Bearer {access-token}'
import requests
headers = {
'Content-Type': 'application/json',
'Accept': 'application/json',
'Authorization': 'Bearer {access-token}'
}
r = requests.post('https://api-gw-service-nmn.local/apis/capmc/capmc/v1/get_power_cap', headers = headers)
print(r.json())
package main
import (
"bytes"
"net/http"
)
func main() {
headers := map[string][]string{
"Content-Type": []string{"application/json"},
"Accept": []string{"application/json"},
"Authorization": []string{"Bearer {access-token}"},
}
data := bytes.NewBuffer([]byte{jsonReq})
req, err := http.NewRequest("POST", "https://api-gw-service-nmn.local/apis/capmc/capmc/v1/get_power_cap", data)
req.Header = headers
client := &http.Client{}
resp, err := client.Do(req)
// ...
}
POST /get_power_cap
Return power capping controls
The get_power_cap
API returns the power capping control(s) and currently applied settings for the requested list of NIDs. Control values which are returned as zero indicates the respective control is unconstrained.
Body parameter
{
"nids": [
1,
40,
41,
42,
43
]
}
Name | In | Type | Required | Description |
---|---|---|---|---|
body | body | object | true | A JSON object to get power capping controls of selected NIDs. |
» nids | body | [integer] | true | User specified list, or empty array for all NIDs. This list must not contain invalid or duplicate NID numbers. If invalid NID numbers are specified then an error will be returned. If empty, the default is all NIDs. The specified NIDs must be in the ready state per the get_node_status command. |
Example responses
200 Response
{
"e": 0,
"err_msg": "",
"nids": [
{
"nid": 40,
"e": 0,
"err_msg": "",
"controls": [
{
"name": "node",
"val": 350
},
{
"name": "accel",
"val": 375
}
]
},
{
"nid": 42,
"e": 0,
"err_msg": "",
"controls": [
{
"name": "node",
"val": 325
},
{
"name": "accel",
"val": 350
}
]
}
]
}
Status | Meaning | Description | Schema |
---|---|---|---|
200 | OK | OK Network API call success | Inline |
500 | Internal Server Error | Internal Server Error | httpError500_InternalServerError |
Status Code 200
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
» e | integer(int32) | true | none | Overall request status code, zero on total success, non-zero if one or more node specific operations fail. |
» err_msg | string | true | none | Message indicating any error encountered. |
» nids | [object] | true | none | Object array containing NID specific result data, each element represents a single NID. |
»» nid | integer(int32) | true | none | NID number owning the returned control objects. |
»» e | integer(int32) | false | none | Optional, error status, non-zero indicates operation failed on this node. |
»» err_msg | string | false | none | Optional, message indicating any error encountered. |
»» controls | [object] | false | none | Optional, array of node level control and status objects which have been queried, one element per control. |
»»» name | string | true | none | Unique control or status object identifier. |
»»» val | integer(int32) | true | none | Control object setting, or zero to indicate control is unconstrained, units are dependent upon control type. |
To perform this operation, you must be authenticated by means of one of the following methods: bearerAuth
Code samples
POST https://api-gw-service-nmn.local/apis/capmc/capmc/v1/get_power_cap_capabilities HTTP/1.1
Host: api-gw-service-nmn.local
Content-Type: application/json
Accept: application/json
# You can also use wget
curl -X POST https://api-gw-service-nmn.local/apis/capmc/capmc/v1/get_power_cap_capabilities \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-H 'Authorization: Bearer {access-token}'
import requests
headers = {
'Content-Type': 'application/json',
'Accept': 'application/json',
'Authorization': 'Bearer {access-token}'
}
r = requests.post('https://api-gw-service-nmn.local/apis/capmc/capmc/v1/get_power_cap_capabilities', headers = headers)
print(r.json())
package main
import (
"bytes"
"net/http"
)
func main() {
headers := map[string][]string{
"Content-Type": []string{"application/json"},
"Accept": []string{"application/json"},
"Authorization": []string{"Bearer {access-token}"},
}
data := bytes.NewBuffer([]byte{jsonReq})
req, err := http.NewRequest("POST", "https://api-gw-service-nmn.local/apis/capmc/capmc/v1/get_power_cap_capabilities", data)
req.Header = headers
client := &http.Client{}
resp, err := client.Do(req)
// ...
}
POST /get_power_cap_capabilities
Return power cap capabilities
The get_power_cap_capabilities
API returns information about installed hardware and its associated properties. Information returned includes the specific hardware types, NID membership, and power capping controls along with their allowable ranges. Information may be returned for a selected set of NIDs or the system as a whole.
Body parameter
{
"nids": [
40,
41,
42,
43
]
}
Name | In | Type | Required | Description |
---|---|---|---|---|
body | body | object | true | A JSON object to get power capping capabilities of selected NIDs. |
» nids | body | [integer] | true | User specified list, or empty array for all NIDs. This list must not contain invalid or duplicate NID numbers. If invalid NID numbers are specified, then an error will be returned. |
Example responses
200 Response
{
"e": 0,
"err_msg": "",
"groups": [
{
"name": "01:000d:306e:0082:000a:0020:3a34:8300",
"desc": "ComputeANC_IVB_130W_10c_32GB_14900_IntelKNCAccel",
"supply": 425,
"host_limit_min": 100,
"host_limit_max": 200,
"static": 0,
"powerup": 120,
"nids": [
40,
41
],
"controls": [
{
"name": "accel",
"desc": "Accelerator control",
"min": 220,
"max": 260
},
{
"name": "node",
"desc": "Node manager control",
"min": 320,
"max": 460
}
]
},
{
"name": "01:000d:306e:00e6:0014:0040:3a34:0000",
"desc": "ComputeANC_IVB_230W_20c_64GB_14900_NoAccel",
"supply": 425,
"host_limit_min": 200,
"host_limit_max": 350,
"static": 0,
"powerup": 150,
"nids": [
42,
43
],
"controls": [
{
"name": "node",
"desc": "Node manager control",
"min": 320,
"max": 460
}
]
}
]
}
Status | Meaning | Description | Schema |
---|---|---|---|
200 | OK | OK Network API call success | Inline |
500 | Internal Server Error | Internal Server Error | httpError500_InternalServerError |
Status Code 200
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
» e | integer(int32) | true | none | Request status code, zero on success. |
» err_msg | string | true | none | Message indicating any error encountered. |
» groups | [object] | true | none | Object array containing hardware specific information and NID membership, each element represent a unique hardware type. |
»» name | string | true | none | Opaque identifier which Cray System Management Software uses to uniquely identify a node type. |
»» desc | string | true | none | Text description of the opaque node type identifier. |
»» host_limit_max | integer(int32) | true | none | Estimated maximum power, specified in watts, which host CPU(s) and memory may consume. |
»» host_limit_min | integer(int32) | true | none | Estimated minimum power, specified in watts, which host CPU(s) and memory require to operate. |
»» static | integer(int32) | true | none | Static per node power overhead, specified in watts, which is unreported. |
»» supply | integer(int32) | true | none | Maximum capacity of each node level power supply for the given hardware type, specified in watts. |
»» powerup | integer(int32) | true | none | Typical power consumption of each node during hardware initialization, specified in watts. |
»» nids | [integer] | true | none | NID members belonging to the given hardware type. |
»» controls | [object] | true | none | Array of node level control objects which may be assigned or queried, one element per control. |
»»» name | string | true | none | Unique control object identifier. |
»»» desc | string | true | none | Message indicating any error encountered. |
»»» min | integer(int32) | true | none | Minimum value which may be assigned to the control object, units are dependent upon control type. |
»»» max | integer(int32) | true | none | Maximum value which may be assigned to the control object, units are dependent upon control type. |
To perform this operation, you must be authenticated by means of one of the following methods: bearerAuth
Code samples
POST https://api-gw-service-nmn.local/apis/capmc/capmc/v1/set_power_cap HTTP/1.1
Host: api-gw-service-nmn.local
Content-Type: application/json
Accept: application/json
# You can also use wget
curl -X POST https://api-gw-service-nmn.local/apis/capmc/capmc/v1/set_power_cap \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-H 'Authorization: Bearer {access-token}'
import requests
headers = {
'Content-Type': 'application/json',
'Accept': 'application/json',
'Authorization': 'Bearer {access-token}'
}
r = requests.post('https://api-gw-service-nmn.local/apis/capmc/capmc/v1/set_power_cap', headers = headers)
print(r.json())
package main
import (
"bytes"
"net/http"
)
func main() {
headers := map[string][]string{
"Content-Type": []string{"application/json"},
"Accept": []string{"application/json"},
"Authorization": []string{"Bearer {access-token}"},
}
data := bytes.NewBuffer([]byte{jsonReq})
req, err := http.NewRequest("POST", "https://api-gw-service-nmn.local/apis/capmc/capmc/v1/set_power_cap", data)
req.Header = headers
client := &http.Client{}
resp, err := client.Do(req)
// ...
}
POST /set_power_cap
Set power capping parameters
The set_power_cap
API is used to establish an upper bound with respect to power consumption on a per-node, and if applicable, a sub-node basis. Established power cap parameters will revert to the default configuration on the next system boot.
Body parameter
{
"nids": [
{
"nid": 40,
"controls": [
{
"name": "node",
"val": 410
},
{
"name": "accel",
"val": 220
}
]
},
{
"nid": 42,
"controls": [
{
"name": "node",
"val": 400
}
]
}
]
}
Name | In | Type | Required | Description |
---|---|---|---|---|
body | body | object | true | A JSON object to set power capping parameters of selected NIDs. |
» nids | body | [object] | true | Object array containing NID specific input data, each element represents a single NID. |
»» nid | body | integer(int32) | true | NID to apply the specified power caps. The specified NID must be in the ready state per the get_node_status command. |
»» controls | body | [object] | true | Array of node level control objects to be adjusted, one element per control. |
»»» name | body | string | true | Specifies a node or accelerator as the type to apply a power cap to. |
»»» val | body | integer(int32) | true | Power cap value to assign to the selected component type. The value given must be within the range returned in the capabilities output. A Value of zero may be supplied to explicitly clear an existing power cap. |
»» controls: Array of node level control objects to be adjusted, one element per control.
Nodes with a high powered accelerators and high TDP processors will be automatically power capped at the supply limit returned per the get_power_cap_capabilities
command. If a node level power cap is specified that is within the node control range but exceeds the supply limit, the actual power cap assigned will be clamped at the supply limit.
The accelerator power cap value represents a subset of the total node level power cap. If a node level power cap of 400 watts is applied and an accelerator power cap of 180 watts is applied, then the total node power consumption is limited to 400 watts. If the accelerator is actively consuming its entire 180 watt power allocation, then the host processor, memory subsystem, and support logic for that node may consume a maximum of 220 watts.
Example responses
200 Response
{
"e": 0,
"err_msg": "",
"nids": [
{
"nid": 40,
"e": 0,
"err_msg": ""
},
{
"nid": 42,
"e": 0,
"err_msg": ""
}
]
}
Status | Meaning | Description | Schema |
---|---|---|---|
200 | OK | OK Network API call success | Inline |
500 | Internal Server Error | Internal Server Error | httpError500_InternalServerError |
Status Code 200
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
» e | integer(int32) | true | none | Request status code, zero on success. |
» err_msg | string | true | none | Message indicating any error encountered. |
» nids | [object] | true | none | Object array containing NID specific error data, NIDs which experienced success are omitted. |
»» nid | integer(int32) | true | none | NID number owning the returned error data. |
»» e | integer(int32) | true | none | Error status, non-zero indicates operation failed on this node. |
»» err_msg | string | true | none | Message indicating any error encountered. |
To perform this operation, you must be authenticated by means of one of the following methods: bearerAuth
Code samples
GET https://api-gw-service-nmn.local/apis/capmc/capmc/v1/health HTTP/1.1
Host: api-gw-service-nmn.local
Accept: application/json
# You can also use wget
curl -X GET https://api-gw-service-nmn.local/apis/capmc/capmc/v1/health \
-H 'Accept: application/json' \
-H 'Authorization: Bearer {access-token}'
import requests
headers = {
'Accept': 'application/json',
'Authorization': 'Bearer {access-token}'
}
r = requests.get('https://api-gw-service-nmn.local/apis/capmc/capmc/v1/health', headers = headers)
print(r.json())
package main
import (
"bytes"
"net/http"
)
func main() {
headers := map[string][]string{
"Accept": []string{"application/json"},
"Authorization": []string{"Bearer {access-token}"},
}
data := bytes.NewBuffer([]byte{jsonReq})
req, err := http.NewRequest("GET", "https://api-gw-service-nmn.local/apis/capmc/capmc/v1/health", data)
req.Header = headers
client := &http.Client{}
resp, err := client.Do(req)
// ...
}
GET /health
Query the health of the service
The health
API returns health information about the CAPMC service and its dependencies. This actively checks the connection between CAPMC and the following:
Different portions of the CAPMC interface are dependent on combinations of the above services. If one or more of these services are unavailable, portions of the CAPMC interface will be unavailable. This is primarily intended as a diagnostic tool to investigate the functioning of the CAPMC service.
Example responses
200 Response
{
"readiness": "Service Degraded",
"vault": "No connection established to vault",
"hsm": "HSM Ready"
}
Status | Meaning | Description | Schema |
---|---|---|---|
200 | OK | OK Network API call success | Inline |
405 | Method Not Allowed | Method Not Allowed | httpError405_MethodNotAllowed |
Status Code 200
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
» readiness | string | true | none | General state of the service - may be Ready, Service Degraded, or Not Ready. |
» vault | string | true | none | Description of the connection to the credentials vault. If there is an error returned when attempting to access the vault that will be included here. |
» hsm | string | true | none | Status of the connection to the Hardware State Manager (HSM). Any error reported by an attempt to access the HSM will be included in this description. |
To perform this operation, you must be authenticated by means of one of the following methods: bearerAuth
Code samples
GET https://api-gw-service-nmn.local/apis/capmc/capmc/v1/liveness HTTP/1.1
Host: api-gw-service-nmn.local
Accept: application/json
# You can also use wget
curl -X GET https://api-gw-service-nmn.local/apis/capmc/capmc/v1/liveness \
-H 'Accept: application/json' \
-H 'Authorization: Bearer {access-token}'
import requests
headers = {
'Accept': 'application/json',
'Authorization': 'Bearer {access-token}'
}
r = requests.get('https://api-gw-service-nmn.local/apis/capmc/capmc/v1/liveness', headers = headers)
print(r.json())
package main
import (
"bytes"
"net/http"
)
func main() {
headers := map[string][]string{
"Accept": []string{"application/json"},
"Authorization": []string{"Bearer {access-token}"},
}
data := bytes.NewBuffer([]byte{jsonReq})
req, err := http.NewRequest("GET", "https://api-gw-service-nmn.local/apis/capmc/capmc/v1/liveness", data)
req.Header = headers
client := &http.Client{}
resp, err := client.Do(req)
// ...
}
GET /liveness
Kubernetes liveness endpoint to monitor service health
The liveness
API works in conjunction with the Kubernetes liveness probe to determine when the service is no longer responding to requests. Too many failures of the liveness probe will result in the service being shut down and restarted.
This is primarily an endpoint for the automated Kubernetes system.
Example responses
405 Response
{
"e": 405,
"err_msg": "(PATCH) Not Allowed"
}
Status | Meaning | Description | Schema |
---|---|---|---|
204 | No Content | No Content Network API call success | None |
405 | Method Not Allowed | Method Not Allowed | httpError405_MethodNotAllowed |
To perform this operation, you must be authenticated by means of one of the following methods: bearerAuth
Code samples
GET https://api-gw-service-nmn.local/apis/capmc/capmc/v1/readiness HTTP/1.1
Host: api-gw-service-nmn.local
Accept: application/json
# You can also use wget
curl -X GET https://api-gw-service-nmn.local/apis/capmc/capmc/v1/readiness \
-H 'Accept: application/json' \
-H 'Authorization: Bearer {access-token}'
import requests
headers = {
'Accept': 'application/json',
'Authorization': 'Bearer {access-token}'
}
r = requests.get('https://api-gw-service-nmn.local/apis/capmc/capmc/v1/readiness', headers = headers)
print(r.json())
package main
import (
"bytes"
"net/http"
)
func main() {
headers := map[string][]string{
"Accept": []string{"application/json"},
"Authorization": []string{"Bearer {access-token}"},
}
data := bytes.NewBuffer([]byte{jsonReq})
req, err := http.NewRequest("GET", "https://api-gw-service-nmn.local/apis/capmc/capmc/v1/readiness", data)
req.Header = headers
client := &http.Client{}
resp, err := client.Do(req)
// ...
}
GET /readiness
Kubernetes readiness endpoint to monitor service health
The readiness
API works in conjunction with the Kubernetes readiness probe to determine when the service is no longer healthy and able to respond correctly to requests. Too many failures of the readiness probe will result in the traffic being routed away from this service and eventually the service will be shut down and restarted if in an unready state for too long.
This is primarily an endpoint for the automated Kubernetes system.
Example responses
405 Response
{
"e": 405,
"err_msg": "(PATCH) Not Allowed"
}
Status | Meaning | Description | Schema |
---|---|---|---|
204 | No Content | No Content Network API call success | None |
405 | Method Not Allowed | Method Not Allowed | httpError405_MethodNotAllowed |
To perform this operation, you must be authenticated by means of one of the following methods: bearerAuth
{
"e": 400,
"err_msg": "Bad Request: invalid URL escape"
}
CAPMC Bad Request error payload
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
e | integer(int32) | false | none | Error status code. |
err_msg | string | false | none | Message indicating any error encountered. |
{
"e": 405,
"err_msg": "(PATCH) Not Allowed"
}
CAPMC Method Not Allowed error payload
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
e | integer(int32) | false | none | Error status code. |
err_msg | string | false | none | Message indicating any error encountered. |
{
"e": 500,
"err_msg": "Connection to the secure store isn't ready. Cannot get Redfish credentials."
}
CAPMC Internal Server Error error payload
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
e | integer(int32) | false | none | Error status code. |
err_msg | string | false | none | Message indicating any error encountered. |