Perhaps, but if that were the case, would you expect the max concurrent
number of deep-scrubs to approach the number of OSDs in the cluster? I
have 72 OSDs in this cluster and concurrent deep-scrubs seem to peak at
a max of 12. Do pools (two in use) and replication settings (3 copies
in both pools) factor in?

72 OSDs / (2 pools * 3 copies) = 12 max concurrent deep-scrubs

That seems plausible (without looking at the code). But if I 'ceph osd
set nodeep-scrub' and then 'ceph osd unset nodeep-scrub', the count of
concurrent deep-scrubs doesn't return to the high level; instead it
stays low, seemingly for days at a time, until the next onslaught. If
this were driven by the max scrub interval, shouldn't it jump right
back up?

Is there a way to find the last scrub time for a given PG via the CLI,
to know for sure? (A few candidate commands are sketched at the bottom
of this mail, below the quoted text.)

Thanks,
Mike Dawson

On 5/7/2014 10:59 PM, Gregory Farnum wrote:
> Is it possible you're running into the max scrub intervals and jumping
> up to one-per-OSD from a much lower normal rate?
>
> On Wednesday, May 7, 2014, Mike Dawson <mike.dawson at cloudapt.com> wrote:
>
>     My write-heavy cluster struggles under the additional load created
>     by deep-scrub from time to time. As I have instrumented the cluster
>     more, it has become clear that there is something I cannot explain
>     happening in the scheduling of PGs to undergo deep-scrub.
>
>     Please refer to these images [0][1] to see two graphical
>     representations of how deep-scrub goes awry in my cluster. These
>     were two separate incidents. Both show a period of "happy" scrubs
>     and deep-scrubs and stable writes/second across the cluster, then
>     an approximately 5x jump in concurrent deep-scrubs where client IO
>     is cut by nearly 50%.
>
>     The first image (deep-scrub-issue1.jpg) shows a happy cluster with
>     low numbers of scrubs and deep-scrubs running until about 10pm,
>     then something triggers deep-scrubs to increase about 5x and remain
>     high until I manually 'ceph osd set nodeep-scrub' at approx 10am.
>     During the time of higher concurrent deep-scrubs, IOPS drop
>     significantly due to OSD spindle contention preventing qemu/rbd
>     clients from writing normally.
>
>     The second image (deep-scrub-issue2.jpg) shows a similar approx 5x
>     jump in concurrent deep-scrubs and an associated drop in
>     writes/second. This image also adds a summary of 'dump historic
>     ops', which shows the expected jump in the slowest ops in the
>     cluster.
>
>     Does anyone have an idea of what is happening when the spike in
>     concurrent deep-scrubs occurs, and how to prevent the adverse
>     effects, outside of disabling deep-scrub permanently?
>
>     0: http://www.mikedawson.com/deep-scrub-issue1.jpg
>     1: http://www.mikedawson.com/deep-scrub-issue2.jpg
>
>     Thanks,
>     Mike Dawson
>
>
> --
> Software Engineer #42 @ http://inktank.com | http://ceph.com
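
P.S. For reference, the commands I had in mind for checking this (from
memory, so the exact field and option names are my assumptions and may
differ by release; the PG id "3.4a" and "osd.0" below are just
placeholders). I'm not sure these are the authoritative place to look,
hence the question above:

    # last scrub / deep-scrub timestamps for a single PG
    ceph pg 3.4a query | grep -E 'last_(deep_)?scrub_stamp'

    # or dump all PGs and read the scrub_stamp / deep_scrub_stamp columns
    ceph pg dump pgs

    # scrub interval settings an OSD is actually running with
    ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep scrub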