NOTE: This page is only applicable to storage NCNs on Gigabyte or HPE hardware.
Utility storage nodes (also known as storage NCNs) are expected to have a particular number of OSDs based on the type of NCN hardware. This page describes how to validate that the storage NCNs have the expected number of OSDs, and provides remediation steps if this is not the case.
(`ncn-s#`) Perform all procedures on this page on any of the first three storage NCNs (`ncn-s001`, `ncn-s002`, or `ncn-s003`).
The following table shows the expected number of OSDs for every storage NCN.
| Hardware Manufacturer | OSD Drive Count per Storage NCN (not including OS drives) |
| --------------------- | ---------------------------------------------------------- |
| Gigabyte              | 12                                                          |
| HPE                   | 8                                                           |
The expected total number of OSDs on the system is determined by taking the number in the second column of the above table and multiplying it by the number of storage NCNs in the system. That is:

```text
total_osds = (Number of utility storage nodes) * (OSD drive count per storage NCN)
```
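For example, a hypothetical system with five HPE storage NCNs would be expected to have 5 * 8 = 40 OSDs in total. The arithmetic can also be captured in shell variables for reuse in later checks; `OSDS_PER_NCN` and `TOTAL_OSDS` are illustrative names used only in the sketches on this page, not variables used by the documented procedure.

```bash
# Illustrative sketch: five HPE storage NCNs with 8 OSD drives each
NUM_STORAGE_NCNS=5
OSDS_PER_NCN=8        # Use 12 for Gigabyte hardware
TOTAL_OSDS=$((NUM_STORAGE_NCNS * OSDS_PER_NCN))
echo "${TOTAL_OSDS}"  # Prints 40 in this example
```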
Record the hardware type of the storage NCNs.
```bash
HWTYPE=$(ipmitool mc info | grep "^Manufacturer Name" | sed 's/^Manufacturer Name[[:space:]][[:space:]]*:[[:space:]][[:space:]]*//')
echo "${HWTYPE}"
```
Example output on a system with HPE hardware:
```text
Hewlett Packard Enterprise
```
Record the number of storage NCNs on the system.
Be sure to set the `NUM_STORAGE_NCNS` variable to the actual number of storage NCNs on the system.

```bash
NUM_STORAGE_NCNS=x
```
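If the count is not known offhand, one possible way to derive it is to count the storage hosts known to the Ceph orchestrator. This is a sketch, not part of the documented procedure; it assumes that all storage NCNs have already joined the Ceph cluster and follow the standard `ncn-s` naming convention.

```bash
# Count Ceph hosts whose names begin with "ncn-s"
NUM_STORAGE_NCNS=$(ceph orch host ls | grep -c "^ncn-s")
echo "${NUM_STORAGE_NCNS}"
```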
Execute the automatic check.
```bash
/opt/cray/tests/install/ncn/scripts/python/check_for_unused_drives.py "${NUM_STORAGE_NCNS}" "${HWTYPE}"
```
The final line of output will report whether or not the number of OSDs found matches what is expected.
Example of successful output on a system with five HPE storage nodes:
```text
2022-09-20 16:00:44.651 DEBUG Parsing command line arguments: ['/opt/cray/tests/install/ncn/scripts/python/check_for_unused_drives.py', '5', 'Hewlett Packard Enterprise']
2022-09-20 16:00:44.652 INFO Based on hardware type (hpe) and number of storage nodes (5): min_expected_osds = 40, max_expected_osds = 40
2022-09-20 16:00:44.652 DEBUG Loading Ceph
2022-09-20 16:00:44.657 DEBUG Connecting to Ceph
2022-09-20 16:00:44.664 INFO Running ceph osd stat command
2022-09-20 16:00:44.671 INFO Command return code = 0
2022-09-20 16:00:44.671 DEBUG Decoding command output to string:
{
    "epoch": 2634,
    "num_osds": 40,
    "num_up_osds": 40,
    "osd_up_since": 1663023364,
    "num_in_osds": 40,
    "osd_in_since": 1659470908,
    "num_remapped_pgs": 0
}
2022-09-20 16:00:44.671 DEBUG Decoding command output from JSON into object
2022-09-20 16:00:44.671 DEBUG Extracting number of OSDs from object
2022-09-20 16:00:44.671 INFO num_osds = 40
2022-09-20 16:00:44.671 INFO SUCCESS -- number of OSDs found matches expectations based on hardware type
```
If the number of OSDs does not match what is expected, then proceed to Manual checks and remediation.
## Manual checks and remediation

If there are OSDs on each node (`ceph osd tree` can show this), then all of the nodes are in Ceph, which means the orchestrator can be used to look for the devices. In that case, begin by following Option 1. Otherwise, proceed to Option 2.
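For example, the following command lists every host and the OSDs beneath it; if each storage node appears with OSDs under it, the orchestrator-based checks in Option 1 apply:

```bash
ceph osd tree
```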
### Option 1: Check devices with the Ceph orchestrator

Get the number of OSDs in the cluster.
```bash
ceph -f json-pretty osd stat | jq .num_osds
```
Example output:
```text
24
```
IMPORTANT: If the returned number of OSDs is equal to the expected total number of OSDs (`total_osds`) calculated earlier, then skip the following steps. If not, then proceed with the additional checks and remediation steps below.
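The comparison can also be scripted. The sketch below assumes that the illustrative `TOTAL_OSDS` variable from the earlier example has been set to the expected total:

```bash
# Compare the live OSD count against the expected total (TOTAL_OSDS is illustrative)
NUM_OSDS=$(ceph -f json-pretty osd stat | jq .num_osds)
if [ "${NUM_OSDS}" -eq "${TOTAL_OSDS}" ]; then
    echo "OK: found ${NUM_OSDS} OSDs, matching the expected total"
else
    echo "MISMATCH: found ${NUM_OSDS} OSDs, expected ${TOTAL_OSDS}"
fi
```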
Compare the number of OSDs to the devices listed by the Ceph orchestrator.
NOTE: If the Ceph cluster is large and has many nodes, then a node name may be specified after the command below to limit the results.
```bash
ceph orch device ls
```
Example output:
```text
Hostname Path Type Serial Size Health Ident Fault Available
ncn-s001 /dev/sda ssd PHYF015500M71P9DGN 1920G Unknown N/A N/A No
ncn-s001 /dev/sdb ssd PHYF016500TZ1P9DGN 1920G Unknown N/A N/A No
ncn-s001 /dev/sdc ssd PHYF016402EB1P9DGN 1920G Unknown N/A N/A No
ncn-s001 /dev/sdd ssd PHYF016504831P9DGN 1920G Unknown N/A N/A No
ncn-s001 /dev/sde ssd PHYF016500TV1P9DGN 1920G Unknown N/A N/A No
ncn-s001 /dev/sdf ssd PHYF016501131P9DGN 1920G Unknown N/A N/A No
ncn-s001 /dev/sdi ssd PHYF016500YB1P9DGN 1920G Unknown N/A N/A No
ncn-s001 /dev/sdj ssd PHYF016500WN1P9DGN 1920G Unknown N/A N/A No
ncn-s002 /dev/sda ssd PHYF0155006W1P9DGN 1920G Unknown N/A N/A No
ncn-s002 /dev/sdb ssd PHYF0155006Z1P9DGN 1920G Unknown N/A N/A No
ncn-s002 /dev/sdc ssd PHYF015500L61P9DGN 1920G Unknown N/A N/A No
ncn-s002 /dev/sdd ssd PHYF015502631P9DGN 1920G Unknown N/A N/A No
ncn-s002 /dev/sde ssd PHYF0153000G1P9DGN 1920G Unknown N/A N/A No
ncn-s002 /dev/sdf ssd PHYF016401T41P9DGN 1920G Unknown N/A N/A No
ncn-s002 /dev/sdi ssd PHYF016504C21P9DGN 1920G Unknown N/A N/A No
ncn-s002 /dev/sdj ssd PHYF015500GQ1P9DGN 1920G Unknown N/A N/A No
ncn-s003 /dev/sda ssd PHYF016402FP1P9DGN 1920G Unknown N/A N/A No
ncn-s003 /dev/sdb ssd PHYF016401TE1P9DGN 1920G Unknown N/A N/A No
ncn-s003 /dev/sdc ssd PHYF015500N51P9DGN 1920G Unknown N/A N/A No
ncn-s003 /dev/sdd ssd PHYF0165010Z1P9DGN 1920G Unknown N/A N/A No
ncn-s003 /dev/sde ssd PHYF016500YR1P9DGN 1920G Unknown N/A N/A No
ncn-s003 /dev/sdf ssd PHYF016500X01P9DGN 1920G Unknown N/A N/A No
ncn-s003 /dev/sdi ssd PHYF0165011H1P9DGN 1920G Unknown N/A N/A No
ncn-s003 /dev/sdj ssd PHYF016500TQ1P9DGN 1920G Unknown N/A N/A No
```
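As the note above mentions, results can be limited to a single node by naming it after the command. For example, to list only the devices on `ncn-s001`:

```bash
ceph orch device ls ncn-s001
```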
If there are devices that show `Available` as `Yes` and they are not being automatically added, then those devices may need to be zapped.
IMPORTANT: Prior to zapping any device, ensure that it is not being used.
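For example, one quick way to see whether a drive already carries a Ceph LVM volume (and is therefore in use) is to inspect it with `lsblk` on the node that owns it. The node and device names below are placeholders; this is a sketch that assumes SSH access between the NCNs.

```bash
# Placeholder node and device; a drive that is in use will show a "ceph--..." LVM child
ssh ncn-s002 lsblk /dev/sdc
```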
Check to see if the number of devices listed by the orchestrator is less than the number of OSDs reported in the first step above.
```bash
ceph orch device ls | grep dev | wc -l
```
Example output:
```text
24
```
If the numbers are equal, but less than the `total_osds` value calculated earlier, then the `ceph-mgr` daemon may need to be failed in order to get a fresh inventory.
Fail the `ceph-mgr` daemon.

```bash
ceph mgr fail $(ceph mgr dump | jq -r .active_name)
```
Wait five minutes and then re-check `ceph orch device ls`.
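For example (waiting is the important part; re-running the check manually after five minutes works just as well):

```bash
# Give ceph-mgr time to rebuild its device inventory, then list devices again
sleep 300
ceph orch device ls
```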
See if the drives are still showing as `Available`. If so, then proceed to the next step.
Compare `lsblk` output on each storage node against the device list from `ceph orch device ls`.

```bash
lsblk
```
Example output:
```text
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
loop0 7:0 0 4.2G 1 loop /run/rootfsbase
loop1 7:1 0 30G 0 loop
└─live-overlay-pool 254:8 0 300G 0 dm
loop2 7:2 0 300G 0 loop
└─live-overlay-pool 254:8 0 300G 0 dm
sda 8:0 0 1.8T 0 disk
└─ceph--0a476f53--8b38--450d--8779--4e587402f8a8-osd--data--b620b7ef--184a--46d7--9a99--771239e7a323 254:7 0 1.8T 0 lvm
```
If a device has an LVM volume like above, then it may be in use. In that case, perform the Option 2 check below to make sure that the drive can be wiped. Otherwise, proceed to the Wipe and add drives procedure.
### Option 2: Check each storage NCN for unused drives

Log into each storage NCN and check for unused drives (a sketch for checking every storage NCN in one pass follows this list). There are two ways to do this:
List all drives on the node.
```bash
cephadm shell -- ceph-volume inventory
```
Note: The following warning message from the `cephadm` command should be ignored if it is seen:

```text
WARNING: The same type, major and minor should not be used for multiple devices.
```
In the output of the command, the `available` field will be `True` if Ceph sees the drive as empty (and therefore available for use). For example:
```text
Device Path Size rotates available Model name
/dev/sda 447.13 GB False False SAMSUNG MZ7LH480
/dev/sdb 447.13 GB False False SAMSUNG MZ7LH480
/dev/sdc 3.49 TB False False SAMSUNG MZ7LH3T8
/dev/sdd 3.49 TB False False SAMSUNG MZ7LH3T8
/dev/sde 3.49 TB False False SAMSUNG MZ7LH3T8
/dev/sdf 3.49 TB False False SAMSUNG MZ7LH3T8
/dev/sdg 3.49 TB False False SAMSUNG MZ7LH3T8
/dev/sdh 3.49 TB False False SAMSUNG MZ7LH3T8
```
List only the paths of available drives on the node.
```bash
cephadm shell -- ceph-volume inventory --format json-pretty | jq -r '.[]|select(.available==true)|.path'
```
Note: The following warning message from the `cephadm` command should be ignored if it is seen:

```text
WARNING: The same type, major and minor should not be used for multiple devices.
```
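To check every storage NCN in one pass rather than logging in to each node individually, a loop like the following may help. This is a sketch, not part of the documented procedure; it assumes SSH access between the NCNs and uses a placeholder node list that must be adjusted for the actual system.

```bash
# List available (empty) drive paths on each storage NCN; adjust the node list as needed
for node in ncn-s001 ncn-s002 ncn-s003; do
    echo "=== ${node} ==="
    ssh "${node}" "cephadm shell -- ceph-volume inventory --format json-pretty" 2>/dev/null |
        jq -r '.[]|select(.available==true)|.path'
done
```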
### Wipe and add drives

Wipe a drive ONLY after confirming that the drive is not being used by the current Ceph cluster, using the procedures above.
The following example wipes drive `/dev/sdc` on `ncn-s002`. Replace these values with the appropriate ones for the situation.
```bash
ceph orch device zap ncn-s002 /dev/sdc --force
```
Add unused drives.
Run this command on the storage NCN containing the drive that was zapped. Be sure to modify the following command to specify the actual name of the zapped drive.
```bash
cephadm shell -- ceph-volume lvm create --data /dev/sd<drive to add> --bluestore
```
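After the drive has been added, the cluster-wide OSD count can be re-checked to confirm that it now matches the expected total calculated earlier.

```bash
# Verify the OSD count after adding drives
ceph -f json-pretty osd stat | jq .num_osds
```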
More information can be found at the `cephadm` reference page.