> On 28 October 2016 at 13:18, Kees Meijs <kees@xxxxxxxx> wrote:
>
> Hi,
>
> On 28-10-16 12:06, wido@xxxxxxxx wrote:
> > I don't like this personally. Your cluster should be capable of doing
> > a deep scrub at any moment. If not, it will also not be able to handle
> > a node failure during peak times.
>
> Valid point and I totally agree. Unfortunately, the current load doesn't
> give me much of a choice, I'm afraid. Tweaking and extending the cluster
> hardware (e.g. more and faster spinners) makes more sense, but we're not
> there yet.
>

Ok, just wanted to mention it.

> Maybe the new parameters help us towards being "always capable".
> Let's hope for the best and see what'll happen. ;-) If it works out, I
> could (and will) remove the time constraints.
>
> > * osd_scrub_sleep .1
> >
> > You can try to bump that even more.
>
> Thank you for pointing that out. I'm unsure about the osd_scrub_sleep
> parameter behaviour (documentation is scarce). Could you please shed a
> little light on this?

It is how long the OSD sleeps between scrub operations. It gives the
underlying disk time to do other I/O.

https://github.com/ceph/ceph/blob/master/src/osd/PG.cc#L3949

The scrub will take more time, but it will also be less intensive.

While doing that you might also want to set the priorities:

osd disk thread ioprio class = idle
osd disk thread ioprio priority = 3

osd recovery op priority = 5
osd client op priority = 63

Make sure you use the CFQ disk scheduler for your disks, though.

Wido

>
> Cheers,
> Kees
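
For reference, a minimal sketch of how the values discussed above could be
applied, assuming they live in the [osd] section of ceph.conf (the numbers
are simply the ones from this thread, not a general recommendation):

[osd]
osd scrub sleep = 0.1
osd disk thread ioprio class = idle
osd disk thread ioprio priority = 3
osd recovery op priority = 5
osd client op priority = 63

They can also be injected into running OSDs without a restart, for example:

ceph tell osd.* injectargs '--osd_scrub_sleep 0.1 --osd_disk_thread_ioprio_class idle --osd_disk_thread_ioprio_priority 3'

The injectargs syntax differs slightly between Ceph releases, so treat this
as an illustration rather than the canonical invocation.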
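
Since the ioprio settings only have an effect under CFQ, the active
scheduler can be checked and changed per disk through sysfs; sdb below is
just a placeholder for an OSD data disk:

cat /sys/block/sdb/queue/scheduler
echo cfq > /sys/block/sdb/queue/scheduler

The first command lists the available schedulers with the active one in
brackets; the echo does not survive a reboot, so a udev rule or the
elevator= kernel parameter is the usual way to make it permanent.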