Deep-Scrub Scheduling

Is it possible you're running into the max scrub intervals and jumping up
to one-per-OSD from a much lower normal rate?
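If that is what's happening, the interval-related OSD options are worth checking. A hypothetical ceph.conf sketch follows; the option names are real Ceph OSD settings, but the values are illustrative placeholders, not recommendations, so verify defaults against your release's documentation:

```ini
[osd]
# Soft floor: a PG becomes eligible for a (shallow) scrub after this
# many seconds, if cluster load permits.
osd scrub min interval = 86400
# Hard ceiling: a PG is scrubbed after this many seconds regardless
# of load.
osd scrub max interval = 604800
# Deep scrubs (full object-data reads) are forced at this interval.
# If many PGs were last deep-scrubbed around the same time, they all
# hit this deadline together, producing a cluster-wide burst.
osd deep scrub interval = 604800
# Each OSD runs at most this many concurrent scrub operations.
osd max scrubs = 1
```

Staggering when PGs were last deep-scrubbed (so their deadlines don't align) is one way to avoid the burst, under the assumption the interval deadline is indeed the trigger here.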

On Wednesday, May 7, 2014, Mike Dawson <mike.dawson at cloudapt.com> wrote:

> My write-heavy cluster struggles under the additional load created by
> deep-scrub from time to time. As I have instrumented the cluster more, it
> has become clear that there is something I cannot explain happening in the
> scheduling of PGs to undergo deep-scrub.
>
> Please refer to these images [0][1] to see two graphical representations
> of how deep-scrub goes awry in my cluster. These were two separate
> incidents. Both show a period of "happy" scrub and deep-scrubs and stable
> writes/second across the cluster, then an approximately 5x jump in
> concurrent deep-scrubs where client IO is cut by nearly 50%.
>
> The first image (deep-scrub-issue1.jpg) shows a happy cluster with low
> numbers of scrub and deep-scrub running until about 10pm, then something
> triggers deep-scrubs to increase about 5x and remain high until I manually
> 'ceph osd set nodeep-scrub' at approx 10am. During the time of higher
> concurrent deep-scrubs, IOPS drop significantly due to OSD spindle
> contention preventing qemu/rbd clients from writing like normal.
>
> The second image (deep-scrub-issue2.jpg) shows a similar approx 5x jump in
> concurrent deep-scrubs and associated drop in writes/second. This image
> also adds a summary of 'dump historic ops' output, which shows the
> expected jump in the slowest ops in the cluster.
>
> Does anyone have an idea of what is happening when the spike in concurrent
> deep-scrub occurs and how to prevent the adverse effects, outside of
> disabling deep-scrub permanently?
>
> 0: http://www.mikedawson.com/deep-scrub-issue1.jpg
> 1: http://www.mikedawson.com/deep-scrub-issue2.jpg
>
> Thanks,
> Mike Dawson
> _______________________________________________
> ceph-users mailing list
> ceph-users at lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>


-- 
Software Engineer #42 @ http://inktank.com | http://ceph.com

