Hi,
check out the docs [0] or my blog post [1]. Either set the new
interval globally, or at least for the mgr as well; otherwise the
mgr will still warn based on the default interval.
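For example (the 2592000 s value mirrors the interval from your message
below; adjust to taste):

```shell
# Option 1: set the deep-scrub interval globally, so the mgr's
# "pgs not deep-scrubbed in time" check uses the same value as the OSDs:
ceph config set global osd_deep_scrub_interval 2592000

# Option 2: keep the osd-level setting and mirror it for the mgr:
ceph config set mgr osd_deep_scrub_interval 2592000

# Verify which value the mgr actually sees:
ceph config get mgr osd_deep_scrub_interval
```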
Regards,
Eugen
[0]
https://docs.ceph.com/en/latest/rados/operations/health-checks/#first-method
[1]
http://heiterbiswolkig.blogs.nde.ag/2024/09/06/pgs-not-deep-scrubbed-in-time/
Zitat von Jan Kasprzak <kas@xxxxxxxxxx>:
Hello, Ceph users,
a question/problem related to deep scrubbing:
I have a HDD-based Ceph 18 cluster currently with 34 osds and 600-ish pgs.
In order to avoid latency peaks which apparently correlate with HDD being
100 % busy for several hours during a deep scrub, I wanted to relax the
scrubbing frequency and concurrency. Six days ago I modified
the following config parameters:
ceph config set osd osd_scrub_max_interval 2592000 # was 604800
ceph config set osd osd_deep_scrub_interval 2592000 # was 604800
ceph config set osd osd_max_scrubs 1 # was 3
The intervals are roughly four times longer and max_scrubs is three times
lower, so the total scrubbing load should _decrease_. However, just several hours
after the config change, my cluster went to HEALTH_WARN with
"XX pgs not deep-scrubbed in time".
I thought this was something temporary, but six days later the number
of pgs not deep-scrubbed in time is still growing; it is now 58.
In "ceph -s" there are about 6-8 pgs active+clean+scrubbing,
and 4-6 active+clean+scrubbing+deep all the time, so scrubbing
still happens.
According to "ceph pg dump" a deep scrub of a pg takes about 9000 seconds.
It seems all pgs are scheduled to be scrubbed in the next two days:
# for i in `seq 18 24`; do echo -n "2024-12-$i "; ceph pg dump 2>/dev/null | grep -c "scheduled @ 2024-12-$i"; done
2024-12-18 152
2024-12-19 422
2024-12-20 7
2024-12-21 0
2024-12-22 0
2024-12-23 0
2024-12-24 0
The positive thing is that latency-wise it helped: with at most one pg being
deep-scrubbed per OSD at any time, the utilization of the HDD never approaches
100 %; it stays at ~60 % while a deep scrub is in progress on that OSD.
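For what it's worth, the numbers above can be sanity-checked with a quick
back-of-envelope sketch (replica size 3 is my assumption, not stated in the
message); it suggests the cluster has ample deep-scrub capacity for a 30-day
interval, i.e. the warning is about the check interval, not about scrubbing
actually falling behind:

```python
# Rough capacity check: can the cluster deep-scrub every pg within
# the new 30-day (2592000 s) interval? Numbers from the message above;
# replica size 3 is an assumption.
NUM_OSDS = 34
NUM_PGS = 600
SECONDS_PER_DEEP_SCRUB = 9000     # observed via "ceph pg dump"
OSD_MAX_SCRUBS = 1                # after the config change
REPLICA_SIZE = 3                  # assumed pool size

# A deep scrub occupies one scrub slot on every OSD holding a replica,
# so the cluster-wide concurrency is roughly:
concurrent_scrubs = NUM_OSDS * OSD_MAX_SCRUBS / REPLICA_SIZE  # ~11.3

# Time for one full deep-scrub pass over all pgs at that concurrency:
full_pass_seconds = NUM_PGS / concurrent_scrubs * SECONDS_PER_DEEP_SCRUB
full_pass_days = full_pass_seconds / 86400

print(f"~{full_pass_days:.1f} days for a full deep-scrub pass")  # ~5.5 days
```

So one pass takes about 5.5 days, comfortably inside a 30-day interval.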
Is there any other config parameter which I should modify together
with the above three parameters?
Thanks!
-Yenya
--
| Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> |
| https://www.fi.muni.cz/~kas/ GPG: 4096R/A45477D5 |
We all agree on the necessity of compromise. We just can't agree on
when it's necessary to compromise. --Larry Wall
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx