Repopulate Data in etcd Clusters When Rebuilding Them

When an etcd cluster is unhealthy, it must be rebuilt. During the rebuild, the pods that rely on that etcd cluster lose their data. That data must be repopulated for the cluster to return to a healthy state.

Applicable services

The following services need their data repopulated in the etcd cluster:

  • Boot Orchestration Service (BOS)
  • Boot Script Service (BSS)
  • Firmware Action Service (FAS)
  • Mountain Endpoint Discovery Service (MEDS)

Prerequisites

An etcd cluster was rebuilt. See Rebuild Unhealthy etcd Clusters.

Procedures

BOS

BOS Session Templates

Boot preparation information can be found in the latest USS Administration Guide and the CSM documentation. Always consult them for authoritative steps and best practices.

BSS

Restore BSS from the etcd backup. See Restore an ETCD Cluster from a Backup.

FAS

Reload the firmware images from Nexus.

Refer to the Load Firmware from Nexus section in FAS Admin Procedures for more information.

When the etcd cluster is rebuilt, all historic data for firmware actions and all recorded snapshots are lost. Image data is reloaded from Nexus. Any images that were loaded into FAS from outside Nexus must be reloaded using the Load Firmware from RPM or ZIP file section in FAS Admin Procedures. After the images are reloaded, any actions that were running at the time of the failure must be recreated.
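
Taken together, the per-service procedures in this document can be tracked as a simple checklist. The following is a minimal Python sketch of such a checklist, not part of any CSM tooling. The names RecoveryStep, RECOVERY_STEPS, and format_checklist are hypothetical and exist only for illustration. The actions and references are copied from the sections above. MEDS is listed as an applicable service, but this document gives no procedure for it, so its entry is left empty rather than guessed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryStep:
    service: str    # service abbreviation used in this document
    action: str     # what this document says to do after the rebuild ("" if not stated)
    reference: str  # procedure or guide named by this document ("" if not stated)


# Entries mirror the sections above. Nothing here calls a real CLI or API.
RECOVERY_STEPS = [
    RecoveryStep("BOS", "Repopulate session templates",
                 "USS Administration Guide and CSM documentation"),
    RecoveryStep("BSS", "Restore from the etcd backup",
                 "Restore an ETCD Cluster from a Backup"),
    RecoveryStep("FAS", "Reload firmware images from Nexus",
                 "Load Firmware from Nexus in FAS Admin Procedures"),
    # Listed as applicable, but no procedure is given in this document.
    RecoveryStep("MEDS", "", ""),
]


def format_checklist(steps):
    """Return one checklist line per service, flagging services with no stated procedure."""
    lines = []
    for step in steps:
        if not step.action:
            lines.append(f"[ ] {step.service}: no procedure given in this document")
        else:
            lines.append(f"[ ] {step.service}: {step.action} (see {step.reference})")
    return lines


if __name__ == "__main__":
    print("\n".join(format_checklist(RECOVERY_STEPS)))
```

Running the script prints one unchecked line per service, which an administrator could paste into a ticket and tick off as each service is repopulated.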