Re: non-effective new deep scrub interval

First, thanks for your answer Christian.

----- On 8 Sep 2016, at 13:30, Christian Balzer chibi@xxxxxxx wrote:

> Hello,
> 
> On Thu, 8 Sep 2016 09:48:46 +0200 (CEST) David DELON wrote:
> 
>> 
>> Hello,
>> 
>> I'm using Ceph Jewel.
>> I would like to schedule the deep scrub operations on my own.
> 
> Welcome to the club, alas the ride isn't for the faint of heart.
> 
> You will want to (re-)search the ML archive (google) and in particular the
> recent "Spreading deep-scrubbing load" thread.

That is not exactly what I would like to do, which is why I posted.
I wanted to trigger the deep scrubbing myself on Sundays with a cron script...
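
For what it's worth, the sort of thing I have in mind is sketched below (untested; the pacing sleep and the awk field selection are my assumptions):

  # hypothetical /etc/cron.d/ceph-deep-scrub entry, early Sunday morning:
  # 0 2 * * 0  root  /usr/local/bin/deep-scrub-all.sh

  # deep-scrub-all.sh, a minimal sketch:
  for pg in $(ceph pg dump pgs_brief 2>/dev/null | awk '$1 ~ /^[0-9]+\./ {print $1}'); do
      ceph pg deep-scrub "$pg"   # ask the primary OSD to deep scrub this PG
      sleep 2                    # crude pacing so everything does not start at once
  done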


>> First of all, i have tried to change the interval value for 30 days:
>> In each /etc/ceph/ceph.conf, i have added:
>> 
>> [osd]
>> #30*24*3600
>> osd deep scrub interval = 2592000
>> I have restarted all the OSD daemons.
> 
> This could have been avoided by an "inject" for all OSDs.
> Restarting (busy) OSDs isn't particularly nice for a cluster.

I did first inject the new value. But as it did not do the trick after some hours, and as the "injectargs" command had returned
"(unchangeable)",
I thought OSD restarts were needed...
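
For reference, the runtime inject I tried was along these lines (the exact spelling of the option is from memory):

  ceph tell osd.* injectargs '--osd_deep_scrub_interval 2592000'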


>> The new value has been taken into account as for each OSD:
>> 
>> ceph --admin-daemon /var/run/ceph/ceph-osd.X.asok config show | grep
>> deep_scrub_interval
>> "osd_deep_scrub_interval": "2.592e+06",
>> 
>> 
>> I have checked the last_deep_scrub value for each pg with
>> ceph pg dump
>> And each pg has been deep scrubbed during the last 7 days (which is the default
>> behavior).
>> 
> See the above thread.
> 
>> Since I made the changes 2 days ago, it keeps on deep scrubbing.
>> Do i miss something?
>> 
> At least 2 things, maybe more.
> 
> Unless you changed the "osd_scrub_max_interval" as well, that will enforce
> things, by default after a week.

Increasing osd_scrub_max_interval and osd_scrub_min_interval does not solve it either.
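
To illustrate, raising all of them together for a 30-day cycle would look something like this in ceph.conf (the exact values are only an example, not a recommendation):

  [osd]
  # 30*24*3600
  osd deep scrub interval = 2592000
  osd scrub max interval = 2592000
  # one week
  osd scrub min interval = 604800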

> And with Jewel you get that well meaning, but turned on by default and
> ill-documented "osd_scrub_interval_randomize_ratio", which will spread
> things out happily and not when you want them.
> 
> Again, read the above thread.
> 
> Also your cluster _should_ be able to endure deep scrubs even when busy,
> otherwise you're looking at trouble when you lose an OSD and the
> resulting balancing as well.
> 
> Setting these to something sensible:
>    "osd_scrub_begin_hour": "0",
>    "osd_scrub_end_hour": "6",
> 
> and especially this:
>    "osd_scrub_sleep": "0.1",


OK, I will consider this solution.
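
Spelled out, I believe that would be something like the following in ceph.conf (the 0 for the randomize ratio is my assumption of how to switch the spreading off):

  [osd]
  osd scrub begin hour = 0
  osd scrub end hour = 6
  osd scrub sleep = 0.1
  osd scrub interval randomize ratio = 0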

> will minimize the impact of scrub as well.
> 
> Christian
> --
> Christian Balzer        Network/Systems Engineer
> chibi@xxxxxxx   	Global OnLine Japan/Rakuten Communications
> http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



