During normal operation, the Ceph cluster performs deep scrubs of the placement groups (PGs) during intervals of low I/O activity on the cluster. By default, each PG is deep-scrubbed once a week. Scheduling of deep scrubs is staggered across the PGs in the Ceph cluster, so that not all PGs are deep-scrubbed at the same time.
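To confirm the interval in effect on a given cluster, the configured value can be queried. The following is a minimal sketch, assuming a Ceph release that provides the ceph config command. The osd_deep_scrub_interval option is expressed in seconds; the helper name deep_scrub_interval_days is hypothetical and only converts that value to days.

```shell
#!/bin/sh
# Hypothetical helper: read an interval in seconds on stdin and print it in days.
deep_scrub_interval_days() {
    awk '{ printf "%.1f\n", $1 / 86400 }'
}

# Usage on a node with Ceph admin access (assumes `ceph config` is available):
#   ceph config get osd osd_deep_scrub_interval | deep_scrub_interval_days
```

A one-week interval corresponds to 604800 seconds.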
When one or more OSDs are down, the PGs on those OSDs cannot be deep-scrubbed. If a deep scrub of a PG is scheduled to occur while its OSDs are down, the deep scrub is delayed until the OSDs are available again. This commonly occurs when the storage nodes are powered down as part of the System Power Off Procedures.
After a prolonged power-off, for example weekend power maintenance, some PGs may be overdue for a deep scrub and begin one after the system is powered on. An alert is displayed in the Ceph status while the overdue deep scrubs are pending. Ceph is fully operational while that alert is present, and the alert should clear when scrubbing is completed. The time to complete deep scrubbing depends on the size of the cluster and the length of the outage. If the alert remains for more than a day, contact support.
The following example output from ceph -s shows Ceph in a HEALTH_WARN state because some deep scrubs were missed while the system was powered down:
  cluster:
    id:     e67366fb-7d13-4219-bdb4-44a5f7e06bf9
    health: HEALTH_WARN
            7 pgs not deep-scrubbed in time
  ...
Note the message accompanying the HEALTH_WARN state: 7 pgs not deep-scrubbed in time. This alert will clear when deep scrubbing completes.
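To watch the alert clear, the count can be pulled out of the status text and checked periodically. The following is a minimal sketch that assumes the warning line has the form shown in the example output; the function name overdue_deep_scrubs is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `ceph -s` output on stdin and print the number of
# PGs reported as "not deep-scrubbed in time". Prints 0 if the warning is absent.
overdue_deep_scrubs() {
    awk '/pgs not deep-scrubbed in time/ {
        # The count is the field immediately before the word "pgs".
        for (i = 2; i <= NF; i++) {
            if ($i == "pgs") { n = $(i - 1); break }
        }
    }
    END { print n + 0 }'
}

# Usage on a node with Ceph admin access:
#   ceph -s | overdue_deep_scrubs
```

Running this every few minutes should show the count falling toward 0 as the deep scrubs complete.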
The ceph pg dump command shows information about the PGs in the Ceph cluster. This command can be used to see when each PG was last deep-scrubbed and thus infer when it will next be deep-scrubbed. For example, the following command gets the last deep scrub time for each PG, converts it to the day of the week, and then counts the PGs last deep-scrubbed on each day. With the default weekly interval, these counts approximate the number of PGs scheduled for deep scrub on each day of the week:
ceph pg dump -f json | jq -r '.pg_map.pg_stats | .[].last_deep_scrub_stamp' | xargs -n 1 date +%A -d | sort | uniq -c
The output of this command will look something like the following:
dumped all
52 Friday
89 Monday
61 Saturday
57 Sunday
54 Thursday
156 Tuesday
116 Wednesday