The Rack Resiliency Service (RRS) provides a set of APIs for managing rack-level resiliency. It queries the Kubernetes cluster for node details across zones and reports aggregated zone information and the detailed status of critical services, both as high-level summaries and as in-depth views of individual zones and services.
Retrieve aggregated zone configuration showing Kubernetes_Topology_Zones and CEPH_Zones including:
- Management_Master_Nodes
- Management_Worker_Nodes
- Management_Storage_Nodes
Alternatively, if zones are not configured, one of the following informational messages is returned:
- "No K8s Topology/Ceph Zones configured"
- "No Ceph zones configured"
- "No K8s topology zones configured"
Retrieve detailed information for a specific zone including:
- Zone_Name
- Management_Master_Nodes
- Management_Worker_Nodes
- Management_Storage_Nodes
- Node status and OSD information
Retrieve a list of critical services grouped by namespace.
Retrieve a summarized view of a specific critical service (without pod details). The response includes:
- configured_instances
- name
- namespace
- type
Update the critical services configuration based on provided input. This endpoint modifies which critical services are monitored.
Retrieve the status of all critical services including service status and distribution details.
Each service object may include:
- name
- status
- balanced: indicates whether the service is properly distributed
Retrieve detailed status for a specific critical service including pod information.
The response includes:
- configured_instances
- currently_running_instances
- name
- namespace
- pods with name, node, status, and zone
- type
- status
- balanced
Base URLs: https://api-gw-service-nmn.local/apis/rrs
License: Hewlett Packard Enterprise Development LP
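Every code sample in this reference targets the same base URL and sends a bearer token. The sketch below shows one way to build such a request with the Python standard library. The helper name `build_request` and the token value are assumptions for illustration and are not part of RRS; the request is constructed but never sent.

```python
import urllib.request

# Base URL taken from the code samples in this document.
BASE_URL = "https://api-gw-service-nmn.local/apis/rrs"


def build_request(path, token, method="GET"):
    """Return an unsent urllib Request for an RRS endpoint (hypothetical helper)."""
    req = urllib.request.Request(BASE_URL + path, method=method)
    req.add_header("Accept", "application/json")
    req.add_header("Authorization", f"Bearer {token}")
    return req


# "example-token" is a placeholder, not a real credential.
req = build_request("/zones", token="example-token")
print(req.get_method(), req.full_url)
```

Passing the result to `urllib.request.urlopen` would perform the call; that step is left out here so the sketch runs without network access.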
Retrieve aggregated and detailed information about zones including node types and counts.
Code samples
GET https://api-gw-service-nmn.local/apis/rrs/zones HTTP/1.1
Host: api-gw-service-nmn.local
Accept: application/json
# You can also use wget
curl -X GET https://api-gw-service-nmn.local/apis/rrs/zones \
-H 'Accept: application/json' \
-H 'Authorization: Bearer {access-token}'
import requests
headers = {
'Accept': 'application/json',
'Authorization': 'Bearer {access-token}'
}
r = requests.get('https://api-gw-service-nmn.local/apis/rrs/zones', headers = headers)
print(r.json())
package main
import (
    "net/http"
)
func main() {
    headers := map[string][]string{
        "Accept": []string{"application/json"},
        "Authorization": []string{"Bearer {access-token}"},
    }
    // GET requests carry no body, so pass nil instead of a buffer.
    req, err := http.NewRequest("GET", "https://api-gw-service-nmn.local/apis/rrs/zones", nil)
    if err != nil {
        panic(err)
    }
    req.Header = headers
    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    // ...
}
GET /zones
Get Zones Configuration
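Returns an object with a "Zones" property that is an array of zones. Each zone contains a Zone_Name, a Kubernetes_Topology_Zone object (Management_Master_Nodes and Management_Worker_Nodes), and a CEPH_Zone object (Management_Storage_Nodes).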
Example responses
Zones configuration
{
"Zones": [
{
"Zone_Name": "cscs-rack-x3001",
"Kubernetes_Topology_Zone": {
"Management_Master_Nodes": [
"ncn-m002"
],
"Management_Worker_Nodes": [
"ncn-w002",
"ncn-w004"
]
},
"CEPH_Zone": {
"Management_Storage_Nodes": [
"ncn-s004",
"ncn-s003"
]
}
},
{
"Zone_Name": "cscs-rack-x3002",
"Kubernetes_Topology_Zone": {
"Management_Master_Nodes": [
"ncn-m003"
],
"Management_Worker_Nodes": [
"ncn-w003"
]
},
"CEPH_Zone": {
"Management_Storage_Nodes": [
"ncn-s005",
"ncn-s002"
]
}
},
{
"Zone_Name": "cscs-rack-x3000",
"Kubernetes_Topology_Zone": {
"Management_Master_Nodes": [
"ncn-m001"
],
"Management_Worker_Nodes": [
"ncn-w001",
"ncn-w005"
]
},
"CEPH_Zone": {
"Management_Storage_Nodes": [
"ncn-s001"
]
}
}
]
}
404 Response
{
"detail": "string",
"errors": {},
"instance": "http://example.com",
"status": 404,
"title": "string",
"type": "about:blank"
}
| Status | Meaning | Description | Schema |
|---|---|---|---|
| 200 | OK | Zones configuration | ZonesResponse |
| 404 | Not Found | Not found | ProblemDetails |
| 500 | Internal Server Error | Internal server error | ProblemDetails |
To perform this operation, you must be authenticated by means of one of the following methods: bearerAuth
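As a rough illustration of how a client might consume the GET /zones response, the sketch below counts the master, worker, and storage nodes in each zone. The function name `summarize_zones` is hypothetical. It assumes that a response without a "Zones" key (for example, one of the informational messages returned when no zones are configured) should yield an empty summary.

```python
def summarize_zones(payload):
    """Map each Zone_Name to its master, worker, and storage node counts."""
    summary = {}
    # Assumption: a payload without "Zones" means no zones are configured.
    for zone in payload.get("Zones", []):
        k8s = zone.get("Kubernetes_Topology_Zone", {})
        ceph = zone.get("CEPH_Zone", {})
        summary[zone["Zone_Name"]] = {
            "masters": len(k8s.get("Management_Master_Nodes", [])),
            "workers": len(k8s.get("Management_Worker_Nodes", [])),
            "storage": len(ceph.get("Management_Storage_Nodes", [])),
        }
    return summary


# Trimmed copy of the example response for this endpoint.
sample_zones = {
    "Zones": [
        {
            "Zone_Name": "cscs-rack-x3001",
            "Kubernetes_Topology_Zone": {
                "Management_Master_Nodes": ["ncn-m002"],
                "Management_Worker_Nodes": ["ncn-w002", "ncn-w004"],
            },
            "CEPH_Zone": {"Management_Storage_Nodes": ["ncn-s004", "ncn-s003"]},
        }
    ]
}
zone_summary = summarize_zones(sample_zones)
print(zone_summary)
```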
Code samples
GET https://api-gw-service-nmn.local/apis/rrs/zones/{zone_name} HTTP/1.1
Host: api-gw-service-nmn.local
Accept: application/json
# You can also use wget
curl -X GET https://api-gw-service-nmn.local/apis/rrs/zones/{zone_name} \
-H 'Accept: application/json' \
-H 'Authorization: Bearer {access-token}'
import requests
headers = {
'Accept': 'application/json',
'Authorization': 'Bearer {access-token}'
}
r = requests.get('https://api-gw-service-nmn.local/apis/rrs/zones/{zone_name}', headers = headers)
print(r.json())
package main
import (
    "net/http"
)
func main() {
    headers := map[string][]string{
        "Accept": []string{"application/json"},
        "Authorization": []string{"Bearer {access-token}"},
    }
    // GET requests carry no body, so pass nil instead of a buffer.
    req, err := http.NewRequest("GET", "https://api-gw-service-nmn.local/apis/rrs/zones/{zone_name}", nil)
    if err != nil {
        panic(err)
    }
    req.Header = headers
    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    // ...
}
GET /zones/{zone_name}
Get Detailed Zone Information
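Returns detailed information for a specific zone. The response includes the Zone_Name and, for each of Management_Master, Management_Worker, and Management_Storage, a Count, a Type, and a Nodes array giving each node's name and status. Storage nodes also report their OSDs grouped as up or down.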
| Name | In | Type | Required | Description |
|---|---|---|---|---|
| zone_name | path | ZoneName | true | The name of the zone |
Example responses
Detailed zone information
{
"Zone_Name": "cscs-rack-x3001",
"Management_Master": {
"Count": 1,
"Type": "Kubernetes_Topology_Zone",
"Nodes": [
{
"name": "ncn-m002",
"status": "Ready"
}
]
},
"Management_Worker": {
"Count": 2,
"Type": "Kubernetes_Topology_Zone",
"Nodes": [
{
"name": "ncn-w002",
"status": "Ready"
},
{
"name": "ncn-w004",
"status": "Ready"
}
]
},
"Management_Storage": {
"Count": 2,
"Type": "CEPH_Zone",
"Nodes": [
{
"name": "ncn-s004",
"status": "NotReady",
"osds": {
"down": [
"osd.0",
"osd.5"
]
}
},
{
"name": "ncn-s003",
"status": "Ready",
"osds": {
"up": [
"osd.4",
"osd.9",
"osd.12",
"osd.15",
"osd.18",
"osd.21",
"osd.24",
"osd.27"
]
}
}
]
}
}
400 Response
{
"detail": "string",
"errors": {},
"instance": "http://example.com",
"status": 400,
"title": "string",
"type": "about:blank"
}
| Status | Meaning | Description | Schema |
|---|---|---|---|
| 200 | OK | Detailed zone information | ZoneDetailResponse |
| 400 | Bad Request | Bad request | ProblemDetails |
| 404 | Not Found | Not found | ProblemDetails |
| 500 | Internal Server Error | Internal server error | ProblemDetails |
To perform this operation, you must be authenticated by means of one of the following methods: bearerAuth
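One plausible use of the detailed zone response is to flag nodes that are not Ready and OSDs that are down. The sketch below does this over a trimmed version of the example response. The function name `find_zone_problems` is an assumption made for illustration.

```python
def find_zone_problems(detail):
    """Return (not_ready_nodes, down_osds) for a zone detail payload."""
    not_ready, down_osds = [], []
    for group in ("Management_Master", "Management_Worker", "Management_Storage"):
        for node in detail.get(group, {}).get("Nodes", []):
            if node.get("status") != "Ready":
                not_ready.append(node["name"])
            # Only storage nodes carry "osds"; other nodes contribute nothing.
            down_osds.extend(node.get("osds", {}).get("down", []))
    return not_ready, down_osds


# Trimmed copy of the example response for this endpoint.
sample_detail = {
    "Zone_Name": "cscs-rack-x3001",
    "Management_Worker": {
        "Count": 1,
        "Type": "Kubernetes_Topology_Zone",
        "Nodes": [{"name": "ncn-w002", "status": "Ready"}],
    },
    "Management_Storage": {
        "Count": 1,
        "Type": "CEPH_Zone",
        "Nodes": [
            {"name": "ncn-s004", "status": "NotReady", "osds": {"down": ["osd.0", "osd.5"]}}
        ],
    },
}
not_ready, down_osds = find_zone_problems(sample_detail)
print(not_ready, down_osds)
```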
Interact with critical service configurations, summaries, and runtime statuses.
Code samples
GET https://api-gw-service-nmn.local/apis/rrs/criticalservices HTTP/1.1
Host: api-gw-service-nmn.local
Accept: application/json
# You can also use wget
curl -X GET https://api-gw-service-nmn.local/apis/rrs/criticalservices \
-H 'Accept: application/json' \
-H 'Authorization: Bearer {access-token}'
import requests
headers = {
'Accept': 'application/json',
'Authorization': 'Bearer {access-token}'
}
r = requests.get('https://api-gw-service-nmn.local/apis/rrs/criticalservices', headers = headers)
print(r.json())
package main
import (
    "net/http"
)
func main() {
    headers := map[string][]string{
        "Accept": []string{"application/json"},
        "Authorization": []string{"Bearer {access-token}"},
    }
    // GET requests carry no body, so pass nil instead of a buffer.
    req, err := http.NewRequest("GET", "https://api-gw-service-nmn.local/apis/rrs/criticalservices", nil)
    if err != nil {
        panic(err)
    }
    req.Header = headers
    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    // ...
}
GET /criticalservices
Get Critical Services
Returns a list of critical services grouped by namespace. The response includes a critical_services property whose namespace object maps each namespace name to an array of service objects, each with a name and a type.
Example responses
List of critical services grouped by namespace
{
"critical_services": {
"namespace": {
"services": [
{
"name": "cray-dns-powerdns",
"type": "Deployment"
},
{
"name": "cray-hbtd",
"type": "Deployment"
},
{
"name": "cray-hmnfd",
"type": "Deployment"
},
{
"name": "cray-keycloak",
"type": "StatefulSet"
},
{
"name": "cray-sls-postgres",
"type": "StatefulSet"
}
],
"spire": [
{
"name": "cray-spire-server",
"type": "StatefulSet"
}
],
"rack-resiliency": [
{
"name": "k8s-zone-api",
"type": "Deployment"
}
]
}
}
}
404 Response
{
"detail": "string",
"errors": {},
"instance": "http://example.com",
"status": 404,
"title": "string",
"type": "about:blank"
}
| Status | Meaning | Description | Schema |
|---|---|---|---|
| 200 | OK | List of critical services grouped by namespace | CriticalServicesListSchema |
| 404 | Not Found | Not found | ProblemDetails |
| 500 | Internal Server Error | Internal server error | ProblemDetails |
To perform this operation, you must be authenticated by means of one of the following methods: bearerAuth
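Because the list response nests services under critical_services, then namespace, then the namespace name, a client will often want a flat view. The sketch below shows one way to flatten it; the generator name `flatten_services` is hypothetical.

```python
def flatten_services(payload):
    """Yield (namespace, name, type) for every listed critical service."""
    namespaces = payload.get("critical_services", {}).get("namespace", {})
    for namespace, services in namespaces.items():
        for svc in services:
            yield namespace, svc["name"], svc["type"]


# Trimmed copy of the example response for this endpoint.
sample_list = {
    "critical_services": {
        "namespace": {
            "services": [{"name": "cray-hbtd", "type": "Deployment"}],
            "spire": [{"name": "cray-spire-server", "type": "StatefulSet"}],
        }
    }
}
rows = list(flatten_services(sample_list))
for row in rows:
    print(row)
```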
Code samples
PATCH https://api-gw-service-nmn.local/apis/rrs/criticalservices HTTP/1.1
Host: api-gw-service-nmn.local
Content-Type: application/json
Accept: application/json
# You can also use wget
curl -X PATCH https://api-gw-service-nmn.local/apis/rrs/criticalservices \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
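-H 'Authorization: Bearer {access-token}' \
-d '{"critical_services": {"k8s-zone-api": {"namespace": "rack-resiliency", "type": "Deployment"}}}'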
import requests
headers = {
'Content-Type': 'application/json',
'Accept': 'application/json',
'Authorization': 'Bearer {access-token}'
}
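payload = {'critical_services': {'k8s-zone-api': {'namespace': 'rack-resiliency', 'type': 'Deployment'}}}
r = requests.patch('https://api-gw-service-nmn.local/apis/rrs/criticalservices', headers = headers, json = payload)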
print(r.json())
package main
import (
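    "bytes"
    "net/http"
)
func main() {
    headers := map[string][]string{
        "Content-Type": []string{"application/json"},
        "Accept": []string{"application/json"},
        "Authorization": []string{"Bearer {access-token}"},
    }
    // Example body; replace it with the services RRS should monitor.
    jsonReq := []byte(`{"critical_services": {"k8s-zone-api": {"namespace": "rack-resiliency", "type": "Deployment"}}}`)
    data := bytes.NewBuffer(jsonReq)
    req, err := http.NewRequest("PATCH", "https://api-gw-service-nmn.local/apis/rrs/criticalservices", data)
    if err != nil {
        panic(err)
    }
    req.Header = headers
    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    // ...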
}
PATCH /criticalservices
Update Critical Services ConfigMap
Updates the critical services configuration. The request body maps each service name to its configuration details: the namespace where the service is deployed and its Kubernetes resource type.
Body parameter
{
"critical_services": {
"property1": {
"namespace": "rack-resiliency",
"type": "Deployment"
},
"property2": {
"namespace": "rack-resiliency",
"type": "Deployment"
}
}
}
| Name | In | Type | Required | Description |
|---|---|---|---|---|
| body | body | CriticalServiceCmStaticType | true | Critical services configuration update |
Example responses
Critical Services Updated Successfully
{
"Update": "Successful",
"Successfully_Added_Services": [
"k8s-zone-api",
"kube-multus-ds"
],
"Already_Existing_Services": [
"coredns",
"kube-proxy"
]
}
400 Response
{
"detail": "string",
"errors": {},
"instance": "http://example.com",
"status": 400,
"title": "string",
"type": "about:blank"
}
| Status | Meaning | Description | Schema |
|---|---|---|---|
| 200 | OK | Critical Services Updated Successfully | CriticalServiceUpdateSchema |
| 400 | Bad Request | Bad request | ProblemDetails |
| 404 | Not Found | Not found | ProblemDetails |
| 500 | Internal Server Error | Internal server error | ProblemDetails |
To perform this operation, you must be authenticated by means of one of the following methods: bearerAuth
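The request body for this operation is keyed by service name. The sketch below builds such a body from a list of tuples and rejects resource types other than the two values listed for ServiceType in this document. The helper name `build_update_body` is an assumption, and whether the server applies the same check is not stated here.

```python
import json

# Values listed for ServiceType in this document.
ALLOWED_TYPES = {"Deployment", "StatefulSet"}


def build_update_body(services):
    """Build the PATCH body from (name, namespace, type) tuples."""
    critical = {}
    for name, namespace, kind in services:
        if kind not in ALLOWED_TYPES:
            raise ValueError(f"unsupported type for {name}: {kind}")
        critical[name] = {"namespace": namespace, "type": kind}
    return {"critical_services": critical}


body = build_update_body([("k8s-zone-api", "rack-resiliency", "Deployment")])
print(json.dumps(body, indent=2))
```

Validating on the client side gives an early, local error instead of a round trip that might end in a 400 response.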
Code samples
GET https://api-gw-service-nmn.local/apis/rrs/criticalservices/{critical_service_name} HTTP/1.1
Host: api-gw-service-nmn.local
Accept: application/json
# You can also use wget
curl -X GET https://api-gw-service-nmn.local/apis/rrs/criticalservices/{critical_service_name} \
-H 'Accept: application/json' \
-H 'Authorization: Bearer {access-token}'
import requests
headers = {
'Accept': 'application/json',
'Authorization': 'Bearer {access-token}'
}
r = requests.get('https://api-gw-service-nmn.local/apis/rrs/criticalservices/{critical_service_name}', headers = headers)
print(r.json())
package main
import (
    "net/http"
)
func main() {
    headers := map[string][]string{
        "Accept": []string{"application/json"},
        "Authorization": []string{"Bearer {access-token}"},
    }
    // GET requests carry no body, so pass nil instead of a buffer.
    req, err := http.NewRequest("GET", "https://api-gw-service-nmn.local/apis/rrs/criticalservices/{critical_service_name}", nil)
    if err != nil {
        panic(err)
    }
    req.Header = headers
    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    // ...
}
GET /criticalservices/{critical_service_name}
Get Critical Service Details (Summarized)
Returns a summarized view of a specific critical service. The response includes the service's name, namespace, type, and configured_instances.
| Name | In | Type | Required | Description |
|---|---|---|---|---|
| critical_service_name | path | ServiceName | true | The name of the critical service |
Example responses
Summarized critical service information
{
"critical_service": {
"name": "cray-hbtd",
"namespace": "services",
"type": "Deployment",
"configured_instances": 3
}
}
400 Response
{
"detail": "string",
"errors": {},
"instance": "http://example.com",
"status": 400,
"title": "string",
"type": "about:blank"
}
| Status | Meaning | Description | Schema |
|---|---|---|---|
| 200 | OK | Summarized critical service information | CriticalServiceDescribeSchema |
| 400 | Bad Request | Bad request | ProblemDetails |
| 404 | Not Found | Not found | ProblemDetails |
| 500 | Internal Server Error | Internal server error | ProblemDetails |
To perform this operation, you must be authenticated by means of one of the following methods: bearerAuth
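The 400, 404, and 500 responses share the ProblemDetails shape (detail, errors, instance, status, title, type). A minimal sketch of turning such a payload into a log-friendly message follows. The function name `format_problem` and the sample field values are hypothetical and are not actual RRS output.

```python
def format_problem(problem):
    """Render a ProblemDetails payload as a one-line message."""
    status = problem.get("status", "?")
    title = problem.get("title", "Error")
    detail = problem.get("detail", "")
    message = f"{status} {title}"
    return f"{message}: {detail}" if detail else message


# Hypothetical payload; the real title and detail strings may differ.
sample_problem = {
    "status": 404,
    "title": "Not Found",
    "detail": "Critical service not found",
    "type": "about:blank",
}
problem_text = format_problem(sample_problem)
print(problem_text)
```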
Code samples
GET https://api-gw-service-nmn.local/apis/rrs/criticalservices/status HTTP/1.1
Host: api-gw-service-nmn.local
Accept: application/json
# You can also use wget
curl -X GET https://api-gw-service-nmn.local/apis/rrs/criticalservices/status \
-H 'Accept: application/json' \
-H 'Authorization: Bearer {access-token}'
import requests
headers = {
'Accept': 'application/json',
'Authorization': 'Bearer {access-token}'
}
r = requests.get('https://api-gw-service-nmn.local/apis/rrs/criticalservices/status', headers = headers)
print(r.json())
package main
import (
    "net/http"
)
func main() {
    headers := map[string][]string{
        "Accept": []string{"application/json"},
        "Authorization": []string{"Bearer {access-token}"},
    }
    // GET requests carry no body, so pass nil instead of a buffer.
    req, err := http.NewRequest("GET", "https://api-gw-service-nmn.local/apis/rrs/criticalservices/status", nil)
    if err != nil {
        panic(err)
    }
    req.Header = headers
    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    // ...
}
GET /criticalservices/status
Get Critical Services Status
Returns the status of all critical services with distribution details. The response groups services by namespace; each service object includes its name, type, status, and balanced value.
Example responses
Critical services status retrieved successfully
{
"critical_services": {
"namespace": {
"services": [
{
"name": "cray-dns-powerdns",
"type": "Deployment",
"status": "Configured",
"balanced": "true"
},
{
"name": "cray-hbtd",
"type": "Deployment",
"status": "Configured",
"balanced": "true"
},
{
"name": "cray-hmnfd",
"type": "Deployment",
"status": "Configured",
"balanced": "true"
},
{
"name": "cray-keycloak",
"type": "StatefulSet",
"status": "Configured",
"balanced": "true"
},
{
"name": "cray-sls-postgres",
"type": "StatefulSet",
"status": "PartiallyConfigured",
"balanced": "true"
}
],
"spire": [
{
"name": "cray-spire-server",
"type": "StatefulSet",
"status": "Configured",
"balanced": "true"
}
],
"kube-system": [
{
"name": "coredns",
"type": "Deployment",
"status": "Configured",
"balanced": "true"
}
],
"rack-resiliency": [
{
"name": "k8s-zone-api",
"type": "Deployment",
"status": "Configured",
"balanced": "true"
}
]
}
}
}
404 Response
{
"detail": "string",
"errors": {},
"instance": "http://example.com",
"status": 404,
"title": "string",
"type": "about:blank"
}
| Status | Meaning | Description | Schema |
|---|---|---|---|
| 200 | OK | Critical services status retrieved successfully | CriticalServicesStatusListSchema |
| 404 | Not Found | Not found | ProblemDetails |
| 500 | Internal Server Error | Internal server error | ProblemDetails |
To perform this operation, you must be authenticated by means of one of the following methods: bearerAuth
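A monitoring script might scan the status response for services that need attention. The sketch below flags any service whose status is outside a caller-supplied set of healthy values or whose balanced value is "false". Treating only "Configured" as healthy is an assumption; this document does not define which status values count as healthy. The function name is hypothetical.

```python
def services_needing_attention(payload, healthy_statuses=("Configured",)):
    """Return (namespace, name) pairs for unhealthy or unbalanced services."""
    flagged = []
    namespaces = payload.get("critical_services", {}).get("namespace", {})
    for namespace, services in namespaces.items():
        for svc in services:
            unhealthy = svc.get("status") not in healthy_statuses
            # "balanced" is a string in the examples: "true", "false", or "NA".
            unbalanced = svc.get("balanced") == "false"
            if unhealthy or unbalanced:
                flagged.append((namespace, svc["name"]))
    return flagged


# Based on the example response; the "false" value for coredns is hypothetical.
sample_status = {
    "critical_services": {
        "namespace": {
            "services": [
                {"name": "cray-hbtd", "type": "Deployment",
                 "status": "Configured", "balanced": "true"},
                {"name": "cray-sls-postgres", "type": "StatefulSet",
                 "status": "PartiallyConfigured", "balanced": "true"},
            ],
            "kube-system": [
                {"name": "coredns", "type": "Deployment",
                 "status": "Configured", "balanced": "false"},
            ],
        }
    }
}
flagged = services_needing_attention(sample_status)
print(flagged)
```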
Code samples
GET https://api-gw-service-nmn.local/apis/rrs/criticalservices/status/{critical_service_name} HTTP/1.1
Host: api-gw-service-nmn.local
Accept: application/json
# You can also use wget
curl -X GET https://api-gw-service-nmn.local/apis/rrs/criticalservices/status/{critical_service_name} \
-H 'Accept: application/json' \
-H 'Authorization: Bearer {access-token}'
import requests
headers = {
'Accept': 'application/json',
'Authorization': 'Bearer {access-token}'
}
r = requests.get('https://api-gw-service-nmn.local/apis/rrs/criticalservices/status/{critical_service_name}', headers = headers)
print(r.json())
package main
import (
    "net/http"
)
func main() {
    headers := map[string][]string{
        "Accept": []string{"application/json"},
        "Authorization": []string{"Bearer {access-token}"},
    }
    // GET requests carry no body, so pass nil instead of a buffer.
    req, err := http.NewRequest("GET", "https://api-gw-service-nmn.local/apis/rrs/criticalservices/status/{critical_service_name}", nil)
    if err != nil {
        panic(err)
    }
    req.Header = headers
    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    // ...
}
GET /criticalservices/status/{critical_service_name}
Get Critical Service Status by Name (Detailed)
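Returns detailed status for a specific critical service, including pod information. The response includes name, namespace, type, status, balanced, configured_instances, currently_running_instances, and a pods array giving each pod's name, status, node, and zone.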
| Name | In | Type | Required | Description |
|---|---|---|---|---|
| critical_service_name | path | ServiceName | true | The name of the critical service |
Example responses
Detailed critical service status retrieved successfully
{
"critical_service": {
"name": "cray-hbtd",
"namespace": "services",
"type": "Deployment",
"status": "Configured",
"balanced": "true",
"configured_instances": 3,
"currently_running_instances": 3,
"pods": [
{
"name": "cray-hbtd-6cbdbd6955-5xlfg",
"status": "Running",
"node": "ncn-w002",
"zone": "cscs-rack-x3001"
},
{
"name": "cray-hbtd-6cbdbd6955-jwzgq",
"status": "Running",
"node": "ncn-w003",
"zone": "cscs-rack-x3002"
},
{
"name": "cray-hbtd-6cbdbd6955-k6pkt",
"status": "Running",
"node": "ncn-w001",
"zone": "cscs-rack-x3000"
}
]
}
}
400 Response
{
"detail": "string",
"errors": {},
"instance": "http://example.com",
"status": 400,
"title": "string",
"type": "about:blank"
}
| Status | Meaning | Description | Schema |
|---|---|---|---|
| 200 | OK | Detailed critical service status retrieved successfully | CriticalServiceStatusDescribeSchema |
| 400 | Bad Request | Bad request | ProblemDetails |
| 404 | Not Found | Not found | ProblemDetails |
| 500 | Internal Server Error | Internal server error | ProblemDetails |
To perform this operation, you must be authenticated by means of one of the following methods: bearerAuth
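The per-pod zone field makes it possible to check how a service is spread across zones. The sketch below counts Running pods per zone and compares configured against running instances, using a trimmed copy of the example response. The function name `pods_per_zone` is hypothetical, and this is not necessarily how RRS computes its own balanced value.

```python
from collections import Counter


def pods_per_zone(payload):
    """Count Running pods of a critical service in each zone."""
    service = payload["critical_service"]
    return Counter(
        pod["zone"] for pod in service.get("pods", []) if pod.get("status") == "Running"
    )


# Trimmed copy of the example response for this endpoint.
sample_service = {
    "critical_service": {
        "name": "cray-hbtd",
        "configured_instances": 3,
        "currently_running_instances": 3,
        "pods": [
            {"name": "cray-hbtd-6cbdbd6955-5xlfg", "status": "Running",
             "node": "ncn-w002", "zone": "cscs-rack-x3001"},
            {"name": "cray-hbtd-6cbdbd6955-jwzgq", "status": "Running",
             "node": "ncn-w003", "zone": "cscs-rack-x3002"},
            {"name": "cray-hbtd-6cbdbd6955-k6pkt", "status": "Running",
             "node": "ncn-w001", "zone": "cscs-rack-x3000"},
        ],
    }
}
zone_counts = pods_per_zone(sample_service)
svc = sample_service["critical_service"]
fully_running = svc["currently_running_instances"] == svc["configured_instances"]
print(dict(zone_counts), fully_running)
```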
Kubernetes health check endpoints for service readiness and liveness probes.
Code samples
GET https://api-gw-service-nmn.local/apis/rrs/healthz/ready HTTP/1.1
Host: api-gw-service-nmn.local
Accept: application/json
# You can also use wget
curl -X GET https://api-gw-service-nmn.local/apis/rrs/healthz/ready \
-H 'Accept: application/json' \
-H 'Authorization: Bearer {access-token}'
import requests
headers = {
'Accept': 'application/json',
'Authorization': 'Bearer {access-token}'
}
r = requests.get('https://api-gw-service-nmn.local/apis/rrs/healthz/ready', headers = headers)
print(r.json())
package main
import (
    "net/http"
)
func main() {
    headers := map[string][]string{
        "Accept": []string{"application/json"},
        "Authorization": []string{"Bearer {access-token}"},
    }
    // GET requests carry no body, so pass nil instead of a buffer.
    req, err := http.NewRequest("GET", "https://api-gw-service-nmn.local/apis/rrs/healthz/ready", nil)
    if err != nil {
        panic(err)
    }
    req.Header = headers
    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    // ...
}
GET /healthz/ready
Retrieve RRS Readiness Probe
Readiness probe for RRS. Kubernetes uses this to determine whether RRS is ready to accept requests.
Example responses
200 Response
{}
| Status | Meaning | Description | Schema |
|---|---|---|---|
| 200 | OK | RRS is ready to accept requests | EmptyDict |
| 500 | Internal Server Error | RRS is not able to accept requests | EmptyDict |
To perform this operation, you must be authenticated by means of one of the following methods: bearerAuth
Code samples
GET https://api-gw-service-nmn.local/apis/rrs/healthz/live HTTP/1.1
Host: api-gw-service-nmn.local
Accept: application/json
# You can also use wget
curl -X GET https://api-gw-service-nmn.local/apis/rrs/healthz/live \
-H 'Accept: application/json' \
-H 'Authorization: Bearer {access-token}'
import requests
headers = {
'Accept': 'application/json',
'Authorization': 'Bearer {access-token}'
}
r = requests.get('https://api-gw-service-nmn.local/apis/rrs/healthz/live', headers = headers)
print(r.json())
package main
import (
    "net/http"
)
func main() {
    headers := map[string][]string{
        "Accept": []string{"application/json"},
        "Authorization": []string{"Bearer {access-token}"},
    }
    // GET requests carry no body, so pass nil instead of a buffer.
    req, err := http.NewRequest("GET", "https://api-gw-service-nmn.local/apis/rrs/healthz/live", nil)
    if err != nil {
        panic(err)
    }
    req.Header = headers
    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    // ...
}
GET /healthz/live
Retrieve RRS Liveness Probe
Liveness probe for RRS. Kubernetes uses this to determine whether RRS is responsive.
Example responses
200 Response
{}
| Status | Meaning | Description | Schema |
|---|---|---|---|
| 200 | OK | RRS is responsive | EmptyDict |
| 500 | Internal Server Error | RRS is not responsive | EmptyDict |
To perform this operation, you must be authenticated by means of one of the following methods: bearerAuth
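Both probes answer 200 when healthy and 500 otherwise, so a client that depends on RRS could poll the readiness endpoint before issuing other calls. The sketch below keeps the polling logic separate from the HTTP call by taking a `probe` callable that returns a status code. The function name, retry count, and delay are assumptions, and the stand-in probe replaces a real request to /healthz/ready.

```python
import time


def wait_until_ready(probe, attempts=5, delay=1.0):
    """Call probe() until it returns HTTP 200 or the attempts run out."""
    for attempt in range(attempts):
        if probe() == 200:
            return True
        # Sleep between attempts, but not after the final one.
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


# Stand-in probe that reports 500 twice and then 200.
responses = iter([500, 500, 200])
ready = wait_until_ready(lambda: next(responses), attempts=5, delay=0)
print(ready)
```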
API version information endpoint.
Code samples
GET https://api-gw-service-nmn.local/apis/rrs/version HTTP/1.1
Host: api-gw-service-nmn.local
Accept: application/json
# You can also use wget
curl -X GET https://api-gw-service-nmn.local/apis/rrs/version \
-H 'Accept: application/json' \
-H 'Authorization: Bearer {access-token}'
import requests
headers = {
'Accept': 'application/json',
'Authorization': 'Bearer {access-token}'
}
r = requests.get('https://api-gw-service-nmn.local/apis/rrs/version', headers = headers)
print(r.json())
package main
import (
    "net/http"
)
func main() {
    headers := map[string][]string{
        "Accept": []string{"application/json"},
        "Authorization": []string{"Bearer {access-token}"},
    }
    // GET requests carry no body, so pass nil instead of a buffer.
    req, err := http.NewRequest("GET", "https://api-gw-service-nmn.local/apis/rrs/version", nil)
    if err != nil {
        panic(err)
    }
    req.Header = headers
    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    // ...
}
GET /version
Get RRS version
Retrieve the version of the RRS service.
Example responses
200 Response
{
"version": "1.0.0"
}
| Status | Meaning | Description | Schema |
|---|---|---|---|
| 200 | OK | RRS Version | VersionSchema |
| 500 | Internal Server Error | Internal server error | ProblemDetails |
To perform this operation, you must be authenticated by means of one of the following methods: bearerAuth
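A client that needs to gate behavior on the service version could parse the version string into a comparable tuple. The sketch below assumes the version is always dotted integers, as in the "1.0.0" example; this document does not guarantee that format. The function name `parse_version` is hypothetical.

```python
def parse_version(payload):
    """Split the "version" string into a tuple of integers."""
    # Assumption: the version consists only of dot-separated integers.
    return tuple(int(part) for part in payload["version"].split("."))


version = parse_version({"version": "1.0.0"})
print(version)
```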
"string"
Unique identifier name for the zone
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| anonymous | string | false | none | Unique identifier name for the zone |
{
"Zones": [
{
"Zone_Name": "string",
"Kubernetes_Topology_Zone": {
"Management_Master_Nodes": [
"string"
],
"Management_Worker_Nodes": [
"string"
]
},
"CEPH_Zone": {
"Management_Storage_Nodes": [
"string"
]
}
}
]
}
Response containing all configured zones in the system with their associated node configurations
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| Zones | [ZoneItemSchema] | true | none | Array of zone configurations showing Kubernetes topology and CEPH zone details |
{
"Zone_Name": "string",
"Kubernetes_Topology_Zone": {
"Management_Master_Nodes": [
"string"
],
"Management_Worker_Nodes": [
"string"
]
},
"CEPH_Zone": {
"Management_Storage_Nodes": [
"string"
]
}
}
Configuration details for a single zone including Kubernetes topology and CEPH zone information
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| Zone_Name | ZoneName | true | none | Unique identifier name for the zone |
| Kubernetes_Topology_Zone | KubernetesTopologyZoneSchema | false | none | Kubernetes topology zone configuration containing master and worker node assignments |
| CEPH_Zone | CephZoneSchema | false | none | CEPH zone configuration containing storage node assignments |
{
"Management_Master_Nodes": [
"string"
],
"Management_Worker_Nodes": [
"string"
]
}
Kubernetes topology zone configuration containing master and worker node assignments
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| Management_Master_Nodes | [string] | false | none | List of Kubernetes master node names assigned to this zone |
| Management_Worker_Nodes | [string] | false | none | List of Kubernetes worker node names assigned to this zone |
{
"Management_Storage_Nodes": [
"string"
]
}
CEPH zone configuration containing storage node assignments
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| Management_Storage_Nodes | [string] | true | none | List of CEPH storage node names assigned to this zone |
{
"Zone_Name": "string",
"Management_Master": {
"Count": 1,
"Type": "Kubernetes_Topology_Zone",
"Nodes": [
{
"name": "string",
"status": "Ready"
}
]
},
"Management_Worker": {
"Count": 1,
"Type": "Kubernetes_Topology_Zone",
"Nodes": [
{
"name": "string",
"status": "Ready"
}
]
},
"Management_Storage": {
"Count": 1,
"Type": "CEPH_Zone",
"Nodes": [
{
"name": "string",
"status": "Ready",
"osds": {
"up": [
"string"
],
"down": [
"string"
]
}
}
]
}
}
Detailed information about a specific zone including node counts, types, and individual node status
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| Zone_Name | ZoneName | true | none | Unique identifier name for the zone |
| Management_Master | ManagementKubernetesSchema | false | none | Management information for Kubernetes nodes (master or worker) including count and individual node details |
| Management_Worker | ManagementKubernetesSchema | false | none | Management information for Kubernetes nodes (master or worker) including count and individual node details |
| Management_Storage | ManagementStorageSchema | false | none | Management information for CEPH storage nodes including count and individual node details with OSD information |
{
"Count": 1,
"Type": "Kubernetes_Topology_Zone",
"Nodes": [
{
"name": "string",
"status": "Ready"
}
]
}
Management information for Kubernetes nodes (master or worker) including count and individual node details
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| Count | integer | true | none | Total number of nodes of this type in the zone |
| Type | string | true | none | Type classification indicating this is a Kubernetes topology zone |
| Nodes | [NodeSchema] | true | none | Detailed information about each individual node |
| Property | Value |
|---|---|
| Type | Kubernetes_Topology_Zone |
{
"Count": 1,
"Type": "CEPH_Zone",
"Nodes": [
{
"name": "string",
"status": "Ready",
"osds": {
"up": [
"string"
],
"down": [
"string"
]
}
}
]
}
Management information for CEPH storage nodes including count and individual node details with OSD information
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| Count | integer | true | none | Total number of storage nodes in the zone |
| Type | string | true | none | Type classification indicating this is a CEPH zone |
| Nodes | [StorageNodeSchema] | true | none | Detailed information about each storage node including OSD status |
| Property | Value |
|---|---|
| Type | CEPH_Zone |
{
"name": "string",
"status": "Ready"
}
Basic node information including name and operational status
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| name | string | true | none | Unique name identifier for the node |
| status | string | true | none | Current operational status of the node |
| Property | Value |
|---|---|
| status | Ready |
| status | NotReady |
| status | Unknown |
{
"name": "string",
"status": "Ready",
"osds": {
"up": [
"string"
],
"down": [
"string"
]
}
}
Storage node information including CEPH OSD (Object Storage Daemon) status details
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| name | string | true | none | Unique name identifier for the storage node |
| status | string | true | none | Current operational status of the storage node |
| osds | OSDStatesSchema | true | none | Object Storage Daemon status information showing which OSDs are operational and non-operational |
| Property | Value |
|---|---|
| status | Ready |
| status | NotReady |
{
"up": [
"string"
],
"down": [
"string"
]
}
Object Storage Daemon status information showing which OSDs are operational and non-operational
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| up | [string] | false | none | List of OSD identifiers that are currently operational |
| down | [string] | false | none | List of OSD identifiers that are currently non-operational |
"rack-resiliency"
Kubernetes namespace name where a service is deployed
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| anonymous | string | false | none | Kubernetes namespace name where a service is deployed |
"cray-dns-powerdns"
Name of the critical service
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| anonymous | string | false | none | Name of the critical service |
"true"
Indicates whether a service is properly distributed across zones for high availability
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| anonymous | string | false | none | Indicates whether a service is properly distributed across zones for high availability |
| Property | Value |
|---|---|
| anonymous | true |
| anonymous | false |
| anonymous | NA |
"error"
Current operational status of a critical service indicating its configuration and runtime state
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| anonymous | string | false | none | Current operational status of a critical service indicating its configuration and runtime state |
| Property | Value |
|---|---|
| anonymous | error |
| anonymous | Configured |
| anonymous | PartiallyConfigured |
| anonymous | NotConfigured |
| anonymous | Running |
| anonymous | Unconfigured |
"Deployment"
Kubernetes resource type of the service
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| anonymous | string | false | none | Kubernetes resource type of the service |
| Property | Value |
|---|---|
| anonymous | Deployment |
| anonymous | StatefulSet |
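The enumerated values for the balanced, status, and type fields can be checked on the client side. The sketch below reports which fields of a service entry fall outside the values listed in the tables above. Note that balanced is a string, so a JSON boolean would be flagged. The function name `validate_service` is an assumption for illustration.

```python
# Enumerated values copied from the schema tables in this document.
BALANCED_VALUES = {"true", "false", "NA"}
STATUS_VALUES = {
    "error", "Configured", "PartiallyConfigured",
    "NotConfigured", "Running", "Unconfigured",
}
TYPE_VALUES = {"Deployment", "StatefulSet"}


def validate_service(entry):
    """Return the fields whose values fall outside the documented enums."""
    checks = {"balanced": BALANCED_VALUES, "status": STATUS_VALUES, "type": TYPE_VALUES}
    return [key for key, allowed in checks.items() if key in entry and entry[key] not in allowed]


# The boolean True is deliberately wrong here: the schema expects the string "true".
problems = validate_service(
    {"name": "coredns", "type": "Deployment", "status": "Configured", "balanced": True}
)
print(problems)
```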
{
"critical_services": {
"namespace": {
"property1": [
{
"name": "cray-dns-powerdns",
"type": "Deployment"
}
],
"property2": [
{
"name": "cray-dns-powerdns",
"type": "Deployment"
}
]
}
}
}
Response containing all critical services organized by namespace
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| critical_services | object | true | none | Critical services grouped by their Kubernetes namespaces |
| » namespace | object | true | none | Mapping of namespace names to their contained critical services |
| »» additionalProperties | [CriticalServiceItemSchema] | false | none | [Basic information about a critical service including its name and Kubernetes resource type] |
{
"name": "cray-dns-powerdns",
"type": "Deployment"
}
Basic information about a critical service including its name and Kubernetes resource type
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| name | ServiceName | true | none | Name of the critical service |
| type | ServiceType | true | none | Kubernetes resource type of the service |
{
  "critical_services": {
    "property1": {
      "namespace": "rack-resiliency",
      "type": "Deployment"
    },
    "property2": {
      "namespace": "rack-resiliency",
      "type": "Deployment"
    }
  }
}
Configuration payload for updating critical services in the system. Contains service definitions that need to be monitored for resiliency
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| critical_services | object | true | none | Mapping from service names to their configuration details for monitoring |
| » additionalProperties | CriticalServiceCmStaticSchema | false | none | Static configuration details for a critical service that needs to be monitored |
{
  "namespace": "rack-resiliency",
  "type": "Deployment"
}
Static configuration details for a critical service that needs to be monitored
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| namespace | NamespaceName | true | none | Kubernetes namespace name where a service is deployed |
| type | ServiceType | true | none | Kubernetes resource type of the service |
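The update payload maps each service name to its static configuration. A minimal Python sketch of building such a body follows; the helper name `build_update_payload` and the client-side type check are assumptions made for illustration, not part of the service.

```python
import json

# Values listed for ServiceType in this document.
ALLOWED_TYPES = {"Deployment", "StatefulSet"}


def build_update_payload(services: dict[str, tuple[str, str]]) -> dict:
    """Turn {service name: (namespace, type)} into the update request body."""
    body = {}
    for name, (namespace, kind) in services.items():
        if kind not in ALLOWED_TYPES:
            # Hypothetical client-side check; the service may validate differently.
            raise ValueError(f"{name}: unsupported type {kind!r}")
        body[name] = {"namespace": namespace, "type": kind}
    return {"critical_services": body}


payload = build_update_payload(
    {"cray-dns-powerdns": ("rack-resiliency", "Deployment")}
)
print(json.dumps(payload, indent=2))
```

Rejecting unknown types before sending keeps malformed entries out of the request, at the cost of duplicating the enumeration on the client side.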
{
  "Update": "Successful",
  "Successfully_Added_Services": [
    "cray-dns-powerdns"
  ],
  "Already_Existing_Services": [
    "cray-dns-powerdns"
  ]
}
Response indicating the result of updating critical services configuration
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| Update | string | true | none | Overall status of the update operation |
| Successfully_Added_Services | [ServiceName] | true | none | List of service names that were successfully added to the configuration |
| Already_Existing_Services | [ServiceName] | true | none | List of service names that were already present in the configuration |
| Property | Value |
|---|---|
| Update | Successful |
| Update | Services Already Exist |
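A caller will usually want to report which services were added and which were skipped. The sketch below shows one way to summarize the response in Python; the function name `summarize_update` and the sample values are illustrative assumptions.

```python
def summarize_update(response: dict) -> str:
    """Build a one-line summary from an update response."""
    added = response["Successfully_Added_Services"]
    existing = response["Already_Existing_Services"]
    return (
        f"{response['Update']}: {len(added)} added, "
        f"{len(existing)} already present"
    )


# Sample response shaped like the schema above.
sample = {
    "Update": "Successful",
    "Successfully_Added_Services": ["cray-dns-powerdns"],
    "Already_Existing_Services": [],
}
print(summarize_update(sample))
```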
{
  "critical_service": {
    "name": "cray-dns-powerdns",
    "namespace": "rack-resiliency",
    "type": "Deployment",
    "configured_instances": 1
  }
}
Summarized view of a critical service without runtime details
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| critical_service | object | true | none | Basic configuration information about the critical service |
| » name | ServiceName | true | none | Name of the critical service |
| » namespace | NamespaceName | true | none | Kubernetes namespace name where a service is deployed |
| » type | ServiceType | true | none | Kubernetes resource type of the service |
| » configured_instances | number | true | none | Number of instances configured for this service (replicas for Deployments/StatefulSets) |
{
  "critical_service": {
    "name": "cray-dns-powerdns",
    "namespace": "rack-resiliency",
    "type": "Deployment",
    "status": "error",
    "balanced": "true",
    "configured_instances": 1,
    "currently_running_instances": 0,
    "pods": [
      {
        "name": "string",
        "node": "string",
        "status": "Running",
        "zone": "string"
      }
    ]
  }
}
Detailed status information for a critical service including runtime details and pod information
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| critical_service | object | true | none | Complete status information including configuration and runtime details |
| » name | ServiceName | true | none | Name of the critical service |
| » namespace | NamespaceName | true | none | Kubernetes namespace name where a service is deployed |
| » type | ServiceType | true | none | Kubernetes resource type of the service |
| » status | ServiceStatus | true | none | Current operational status of a critical service indicating its configuration and runtime state |
| » balanced | ServiceBalanced | true | none | Indicates whether a service is properly distributed across zones for high availability |
| » configured_instances | number | true | none | Number of instances configured for this service |
| » currently_running_instances | number | true | none | Number of instances currently running and healthy |
| » pods | [PodSchema] | true | none | Detailed information about each pod instance of this service |
{
  "name": "string",
  "node": "string",
  "status": "Running",
  "zone": "string"
}
Information about an individual pod instance including its location and status
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| name | string | true | none | Unique name of the pod instance |
| node | string | true | none | Kubernetes node where the pod is scheduled to run |
| status | string | true | none | Current operational status of the pod |
| zone | string | true | none | Zone where the pod is located based on its assigned node |
| Property | Value |
|---|---|
| status | Running |
| status | Pending |
| status | Failed |
| status | Terminating |
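Because each pod entry carries both a `status` and a `zone`, a client can derive the per-zone distribution of running pods from a detailed status response. The following Python sketch illustrates this; the function name `running_pods_per_zone` and the pod, node, and zone names in the sample are hypothetical.

```python
from collections import Counter


def running_pods_per_zone(response: dict) -> Counter:
    """Count pods with status "Running" in each zone of a detailed status response."""
    pods = response["critical_service"]["pods"]
    return Counter(pod["zone"] for pod in pods if pod["status"] == "Running")


# Sample response fragment; pod, node, and zone names are made up for illustration.
sample = {
    "critical_service": {
        "name": "cray-dns-powerdns",
        "pods": [
            {"name": "pod-a", "node": "node-1", "status": "Running", "zone": "zone-a"},
            {"name": "pod-b", "node": "node-2", "status": "Pending", "zone": "zone-b"},
        ],
    }
}
print(dict(running_pods_per_zone(sample)))
```

Zones with no running pods do not appear in the result; a `Counter` returns 0 when such a zone is looked up.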
{
  "critical_services": {
    "namespace": {
      "property1": [
        {
          "name": "cray-dns-powerdns",
          "type": "Deployment",
          "status": "error",
          "balanced": "true"
        }
      ],
      "property2": [
        {
          "name": "cray-dns-powerdns",
          "type": "Deployment",
          "status": "error",
          "balanced": "true"
        }
      ]
    }
  }
}
Status overview of all critical services organized by namespace, showing their operational state and distribution
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| critical_services | object | true | none | Critical services grouped by namespace with their status information |
| » namespace | object | true | none | Mapping of namespace names to their contained critical services with status |
| »» additionalProperties | [CriticalServiceStatusItemSchema] | false | none | [Status summary for a critical service showing its operational state and distribution across zones] |
{
  "name": "cray-dns-powerdns",
  "type": "Deployment",
  "status": "error",
  "balanced": "true"
}
Status summary for a critical service showing its operational state and distribution across zones
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| name | ServiceName | true | none | Name of the critical service |
| type | ServiceType | true | none | Kubernetes resource type of the service |
| status | ServiceStatus | true | none | Current operational status of a critical service indicating its configuration and runtime state |
| balanced | ServiceBalanced | true | none | Indicates whether a service is properly distributed across zones for high availability |
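To get a quick overview of a status response, a client might group services by their reported `status`. This Python sketch makes no assumption about which status values count as healthy; the function name `group_by_status`, the namespace keys, and the service `example-service` are hypothetical.

```python
from collections import defaultdict


def group_by_status(response: dict) -> dict[str, list[str]]:
    """Map each status value to the "namespace/name" of the services reporting it."""
    grouped = defaultdict(list)
    namespaces = response.get("critical_services", {}).get("namespace", {})
    for namespace, services in namespaces.items():
        for service in services:
            grouped[service["status"]].append(f"{namespace}/{service['name']}")
    return dict(grouped)


# Sample response; namespace keys and "example-service" are illustrative.
sample = {
    "critical_services": {
        "namespace": {
            "services": [
                {"name": "cray-dns-powerdns", "type": "Deployment",
                 "status": "Running", "balanced": "true"}
            ],
            "rack-resiliency": [
                {"name": "example-service", "type": "StatefulSet",
                 "status": "error", "balanced": "true"}
            ],
        }
    }
}
print(group_by_status(sample))
```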
{
  "version": "1.0.0"
}
Version information for the Rack Resiliency Service
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| version | string | true | none | The current version of the Rack Resiliency Service |
{
  "detail": "string",
  "errors": {},
  "instance": "http://example.com",
  "status": 400,
  "title": "string",
  "type": "about:blank"
}
An error response in RFC 7807 problem details format.
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| detail | string | false | none | A human-readable explanation specific to this occurrence of the problem. Focus on helping correct the problem, rather than giving debugging information. |
| errors | object | false | none | An object denoting field-specific errors. Only present on error responses when field input is specified for the request. |
| instance | string(uri) | false | none | A relative URI reference that identifies the specific occurrence of the problem |
| status | integer | false | none | HTTP status code |
| title | string | false | none | Short, human-readable summary of the problem. It should not change from occurrence to occurrence. |
| type | string(uri) | false | none | Relative URI reference to the type of problem which includes human-readable documentation. |
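Since every field in the problem details object is optional, client code should tolerate missing members. The Python sketch below shows one way to do so; the class name `ProblemDetails` and its `from_json` helper are illustrative assumptions. The default of `about:blank` for `type` follows RFC 7807, which treats an absent `type` as that value.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ProblemDetails:
    """Client-side view of an RFC 7807 problem details body."""
    type: str = "about:blank"
    title: str = ""
    status: Optional[int] = None
    detail: str = ""
    instance: str = ""
    errors: dict = field(default_factory=dict)

    @classmethod
    def from_json(cls, body: dict) -> "ProblemDetails":
        # Keep only the members defined above; ignore any extension members.
        known = {k: body[k] for k in cls.__dataclass_fields__ if k in body}
        return cls(**known)


problem = ProblemDetails.from_json({"status": 404, "title": "Not Found"})
print(problem.status, problem.title, problem.type)
```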
{}
Empty response object typically returned by health check endpoints to indicate successful operation
None