How to control automatic deep-scrubs

Eugen Block <eblock@xxxxxx> · Wed, 13 Feb 2019 11:35:54 +0000

Hi cephers,

I'm struggling a little with deep-scrubs. I know this has been
discussed multiple times (e.g. in [1]), and we also use a well-known
crontab script in a Luminous cluster (12.2.10) to start deep-scrubs
manually (a quarter of all PGs, four times a week). The script works
just fine, but it doesn't prevent the automatic deep-scrubs initiated
by Ceph itself.
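
For readers who don't know that kind of script, the selection step can be sketched roughly as below. This is not the script referred to above, only a minimal illustration of the idea under stated assumptions: the function name oldest_fraction and the sample PG ids and timestamps are hypothetical, and in practice the stamps would come from the cluster's PG dump.

```python
import math
from datetime import datetime


def oldest_fraction(pg_stamps, fraction=0.25):
    """Return the ids of the PGs whose last deep-scrub is oldest.

    pg_stamps is a list of (pgid, last_deep_scrub_datetime) pairs.
    """
    ordered = sorted(pg_stamps, key=lambda item: item[1])
    count = math.ceil(len(ordered) * fraction)
    return [pgid for pgid, _ in ordered[:count]]


# Hypothetical sample data, not taken from a real cluster.
sample = [
    ("1.0", datetime(2019, 2, 10, 21, 0)),
    ("1.1", datetime(2019, 2, 8, 23, 30)),
    ("1.2", datetime(2019, 2, 4, 20, 15)),
    ("1.3", datetime(2019, 2, 6, 22, 52)),
]

for pgid in oldest_fraction(sample):
    # A cron job would issue "ceph pg deep-scrub <pgid>" here.
    print("would deep-scrub", pgid)
```

Running the selection four times a week with a fraction of 0.25 touches every PG about once a week, provided each run picks the PGs with the oldest stamps.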
These are the relevant config settings:

osd_scrub_begin_hour = 0
osd_scrub_end_hour = 7
osd_scrub_sleep = 0.1
osd_deep_scrub_interval = 2419200
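
To make these numbers easier to read, here is a small sketch that converts the interval to days and checks whether a given hour falls into the scrub window. The helper names are made up for illustration, and the window check assumes a half-open range (begin hour included, end hour excluded) that wraps past midnight when begin is larger than end.

```python
from datetime import timedelta

# Values copied from the settings listed above.
OSD_SCRUB_BEGIN_HOUR = 0
OSD_SCRUB_END_HOUR = 7
OSD_DEEP_SCRUB_INTERVAL = 2419200  # seconds


def interval_in_days(seconds):
    """Convert an interval given in seconds to whole days."""
    return timedelta(seconds=seconds).days


def in_scrub_window(hour, begin=OSD_SCRUB_BEGIN_HOUR, end=OSD_SCRUB_END_HOUR):
    """Assumed semantics: begin <= hour < end, wrapping past midnight."""
    if begin < end:
        return begin <= hour < end
    return hour >= begin or hour < end


print(interval_in_days(OSD_DEEP_SCRUB_INTERVAL))
for hour in (22, 0, 1):
    print(hour, in_scrub_window(hour))
```

So the interval is 28 days, and under the assumption above hours 0 and 1 are inside the window while hour 22 is not.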

We expected these settings to prevent the automatic deep-scrubs, since
the cronjob deep-scrubs every PG well within the 28-day interval
(2419200 seconds). They are started anyway, but only between midnight
and 7 am, so at least some of the configs are "honored". I took a look
at one specific PG:

2019-02-06 22:52:03.438079 7fd7f19cb700  0 log_channel(cluster) log [DBG] : 1.b7d deep-scrub starts
2019-02-06 22:52:24.909413 7fd7f19cb700  0 log_channel(cluster) log [DBG] : 1.b7d deep-scrub ok
2019-02-11 00:39:42.941238 7fd7f19cb700  0 log_channel(cluster) log [DBG] : 1.b7d deep-scrub starts
2019-02-11 00:40:04.447500 7fd7f19cb700  0 log_channel(cluster) log [DBG] : 1.b7d deep-scrub ok
2019-02-12 01:35:17.898666 7f97e42fa700  0 log_channel(cluster) log [DBG] : 1.b7d deep-scrub starts
2019-02-12 01:35:39.145579 7f97e42fa700  0 log_channel(cluster) log [DBG] : 1.b7d deep-scrub ok

The scrub on 2019-02-06 came from the cronjob and the one on
2019-02-11 was automatic. The last one could be either; it is hard to
tell because the cronjob starts at 20:00 and usually finishes at about
5:30 in the morning. Either way, I wouldn't expect a PG to be
deep-scrubbed on two consecutive nights.
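
To put numbers on the gaps between these runs, the following sketch parses the "deep-scrub starts" entries for one PG and prints the time between consecutive starts. The log text is the excerpt quoted above with each entry on a single line; the function names and the regular expression are my own and only assume the line layout shown in that excerpt.

```python
import re
from datetime import datetime

LOG = """\
2019-02-06 22:52:03.438079 7fd7f19cb700  0 log_channel(cluster) log [DBG] : 1.b7d deep-scrub starts
2019-02-06 22:52:24.909413 7fd7f19cb700  0 log_channel(cluster) log [DBG] : 1.b7d deep-scrub ok
2019-02-11 00:39:42.941238 7fd7f19cb700  0 log_channel(cluster) log [DBG] : 1.b7d deep-scrub starts
2019-02-11 00:40:04.447500 7fd7f19cb700  0 log_channel(cluster) log [DBG] : 1.b7d deep-scrub ok
2019-02-12 01:35:17.898666 7f97e42fa700  0 log_channel(cluster) log [DBG] : 1.b7d deep-scrub starts
2019-02-12 01:35:39.145579 7f97e42fa700  0 log_channel(cluster) log [DBG] : 1.b7d deep-scrub ok
"""

# Timestamp at the start of the line, PG id right before "deep-scrub starts".
START_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) .* : (\S+) deep-scrub starts$"
)


def deep_scrub_starts(log_text, pgid):
    """Return the start times of all deep-scrubs logged for one PG."""
    starts = []
    for line in log_text.splitlines():
        match = START_RE.match(line.strip())
        if match and match.group(2) == pgid:
            starts.append(
                datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
            )
    return starts


def gaps_in_hours(starts):
    """Hours between consecutive start times."""
    return [
        (later - earlier).total_seconds() / 3600
        for earlier, later in zip(starts, starts[1:])
    ]


for gap in gaps_in_hours(deep_scrub_starts(LOG, "1.b7d")):
    print(f"{gap:.1f} h")
```

For PG 1.b7d this gives roughly 98 hours between the first two starts and roughly 25 hours between the last two, both far below the configured 28-day interval.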

Then I continued looking for other config options etc., maybe we  
missed something, and I stumbled upon [2], where it says:

PGs are normally scrubbed every osd_deep_mon_scrub_interval seconds

So I searched for that config option with "ceph daemon [...] config
show" but couldn't find it in either a Luminous or a Mimic cluster.
Setting it in the ceph.conf of a test cluster (and restarting the
cluster) doesn't make it show up in the config dump either. Is this a
mistake in the docs? Could it be related to my question?
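
For anyone who wants to repeat the search, here is a minimal sketch of how I would filter such a dump. It assumes the dump is a JSON object mapping option names to values; the function names are hypothetical, the daemon name has to be supplied by the caller, and the sample dump is hand-written from the settings above rather than copied from a cluster.

```python
import json
import subprocess


def load_config(daemon):
    """Run "ceph daemon <daemon> config show" and parse the JSON output.

    Only works on the host where the daemon's admin socket lives.
    """
    output = subprocess.check_output(["ceph", "daemon", daemon, "config", "show"])
    return json.loads(output)


def find_options(config, needle):
    """Return all option names that contain the given substring."""
    return sorted(name for name in config if needle in name)


# Hand-written sample, not a real dump; value formats are an assumption.
sample = json.loads("""
{
  "osd_deep_scrub_interval": "2419200.000000",
  "osd_scrub_begin_hour": "0",
  "osd_scrub_end_hour": "7",
  "osd_scrub_sleep": "0.100000"
}
""")

print(find_options(sample, "deep"))
print(find_options(sample, "osd_deep_mon_scrub_interval"))
```

With the sample, the first search finds only osd_deep_scrub_interval and the second finds nothing, which matches what I see on the real clusters.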

Of course I could set the nodeep-scrub flag to prevent automatic
deep-scrubs, but I consider OSD flags a temporary measure. How do you
handle your deep-scrubs? Any hints are appreciated!

Best regards,
Eugen

[1] http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-September/021118.html
[2] http://docs.ceph.com/docs/luminous/rados/operations/health-checks/

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com