Is it possible you're running into the max scrub intervals and jumping up to one-per-OSD from a much lower normal rate?

On Wednesday, May 7, 2014, Mike Dawson <mike.dawson at cloudapt.com> wrote:

> My write-heavy cluster struggles under the additional load created by
> deep-scrub from time to time. As I have instrumented the cluster more, it
> has become clear that there is something I cannot explain happening in the
> scheduling of PGs to undergo deep-scrub.
>
> Please refer to these images [0][1] for two graphical representations of
> how deep-scrub goes awry in my cluster. They cover two separate incidents.
> Both show a period of "happy" scrubs and deep-scrubs with stable
> writes/second across the cluster, then an approximately 5x jump in
> concurrent deep-scrubs during which client IO is cut by nearly 50%.
>
> The first image (deep-scrub-issue1.jpg) shows a happy cluster with low
> numbers of scrubs and deep-scrubs running until about 10pm, when something
> triggers deep-scrubs to increase about 5x and remain high until I manually
> ran 'ceph osd set nodeep-scrub' at approximately 10am. During the period
> of higher concurrent deep-scrubs, IOPS drop significantly because OSD
> spindle contention prevents qemu/rbd clients from writing normally.
>
> The second image (deep-scrub-issue2.jpg) shows a similar, approximately 5x
> jump in concurrent deep-scrubs and the associated drop in writes/second.
> This image also adds a summary of 'dump historic ops', which shows the
> expected jump in the slowest ops in the cluster.
>
> Does anyone have an idea of what is happening when the spike in concurrent
> deep-scrubs occurs, and how to prevent the adverse effects, other than
> disabling deep-scrub permanently?
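For reference, this is a sketch of the scrub-related ceph.conf options behind that hypothesis (values are the commonly cited defaults for this era of Ceph; verify against your running version with 'ceph daemon osd.N config show | grep scrub' before relying on them):

```ini
[osd]
; Each OSD runs at most this many concurrent scrub operations, so a
; cluster-wide spike means many OSDs are hitting their deadlines at once.
osd max scrubs = 1

; A PG becomes eligible for a light scrub after this interval, but only
; while system load stays below the threshold below.
osd scrub min interval = 86400        ; 1 day
osd scrub load threshold = 0.5

; Once a PG has gone this long without a scrub, it is scrubbed regardless
; of load -- a plausible source of a synchronized burst if many PGs were
; last scrubbed around the same time.
osd scrub max interval = 604800       ; 7 days

; Deep scrubs (full object reads, hence the spindle contention) are
; forced at this interval.
osd deep scrub interval = 604800      ; 7 days
```

If the deadlines of many PGs expire together (e.g. they were all created or last deep-scrubbed in one window), the load-threshold logic no longer throttles them and they deep-scrub en masse; comparing last_deep_scrub_stamp values in 'ceph pg dump' output around 10pm would help confirm or rule this out.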
>
> 0: http://www.mikedawson.com/deep-scrub-issue1.jpg
> 1: http://www.mikedawson.com/deep-scrub-issue2.jpg
>
> Thanks,
> Mike Dawson
> _______________________________________________
> ceph-users mailing list
> ceph-users at lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

--
Software Engineer #42 @ http://inktank.com | http://ceph.com