Re: RFC: (deep-)scrub manager module

On 6/19/22 09:04, Josh Salomon wrote:
I have one comment: I wouldn't use the word "throttling" but rather "scheduling", since we don't just want to limit scrubs; we need some other policy.

Good point.

As long as we have a single PG scrub executing, we can run scrubs on all the non-scrubbed OSDs for free (since we already pay for the performance degradation). Therefore we need a plan for how to execute as many scrubs simultaneously as possible, as long as all the OSDs are loaded evenly. For example, assume we have 100 OSDs and replica 3: when a scrub runs, we would like to have 33 PGs scrubbed simultaneously, with no OSD appearing in more than 1 PG, so from the OSD perspective 99 OSDs will execute a scrub simultaneously (we can't get to 100 with 1 scrub per OSD, only with 3 simultaneous scrubs per OSD).

Yes, for sure, this is an interesting scrub strategy I hadn't thought about. The reasoning behind it: make sure you scrub as many PGs as you can when allowed to, not wasting time (as long as the load is evenly distributed among OSDs). Do I get that right?
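The batching idea discussed above can be illustrated with a small sketch. This is not code from Ceph or from the proposed module; the function name `pick_scrub_batch`, the input layout (PG id mapped to a last-scrub timestamp and a list of OSD ids) and the `max_scrubs_per_osd` parameter are all assumptions made for illustration. It walks the PGs from oldest scrub to newest and greedily accepts a PG only if none of its OSDs is already busy.

```python
from typing import Dict, List, Tuple

# Hypothetical input layout: pg id -> (last scrub timestamp, OSD ids holding the PG)
PGMap = Dict[str, Tuple[float, List[int]]]


def pick_scrub_batch(pgs: PGMap, max_scrubs_per_osd: int = 1) -> List[str]:
    """Greedily pick PGs to scrub at the same time.

    PGs are considered oldest-scrubbed first. A PG is accepted only if
    every OSD it lives on is still below max_scrubs_per_osd.
    """
    load: Dict[int, int] = {}  # OSD id -> scrubs already assigned in this batch
    batch: List[str] = []
    for pgid, (_stamp, osds) in sorted(pgs.items(), key=lambda kv: kv[1][0]):
        if all(load.get(osd, 0) < max_scrubs_per_osd for osd in osds):
            for osd in osds:
                load[osd] = load.get(osd, 0) + 1
            batch.append(pgid)
    return batch


if __name__ == "__main__":
    example = {
        "1.0": (10.0, [0, 1, 2]),
        "1.1": (5.0, [2, 3, 4]),
        "1.2": (20.0, [5, 6, 7]),
        "1.3": (30.0, [0, 5, 8]),
    }
    print(pick_scrub_batch(example))
    print(pick_scrub_batch(example, max_scrubs_per_osd=2))
```

With 100 OSDs and replica 3, at most 100 // 3 = 33 disjoint PGs fit in one batch, covering 99 OSDs. Note that a greedy pass like this respects the "oldest first" policy but does not guarantee the largest possible batch; a real scheduler might trade some ordering strictness for better OSD coverage.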

Such a plan, with the other policies described (starting with the oldest-scrubbed OSDs), should create an optimal plan when all the OSDs are symmetrical (same capacity and technology). Improving it for different capacities and technologies is an interesting exercise for future phases.

Indeed. First start with the simplest case. But it is good to know beforehand, so the actual implementation can anticipate future improvements.

One last point - we may want different priorities per pool (one pool requires weekly scrubs and another monthly scrubs); this should also be part of the scheduling algorithm.

As you mention priorities: should it have some sort of fairness algorithm that avoids situations where a pool might not be scrubbed at all because of the constraints imposed? I can imagine a heavily loaded cluster, with spinning disks, where a high priority pool might get scrubbed, but a lower priority pool might not. In that case you might want to scrub PGs from the lower priority pool every x period to avoid the pool never being scrubbed at all. This might make things (overly) complicated though.
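One way to get both per-pool priorities and a basic form of fairness is to rank PGs by how overdue they are relative to their pool's target interval. The sketch below is only an illustration of that idea under assumed names (`scrub_order`, `pool_interval`, the pool labels); it is not an existing Ceph interface. A PG in a monthly pool starts out less urgent than one in a weekly pool, but its urgency keeps growing, so it eventually outranks recently scrubbed high-priority PGs instead of starving.

```python
from typing import Dict, List, Tuple

DAY = 86400.0  # seconds


def scrub_order(
    pgs: Dict[str, Tuple[str, float]],   # pg id -> (pool name, last scrub timestamp)
    pool_interval: Dict[str, float],     # pool name -> target scrub interval in seconds
    now: float,
) -> List[str]:
    """Return PG ids sorted by urgency, most urgent first.

    Urgency is the PG's scrub age divided by its pool's target interval,
    so a value above 1.0 means the PG has missed its target.
    """
    def urgency(pgid: str) -> float:
        pool, last_scrub = pgs[pgid]
        return (now - last_scrub) / pool_interval[pool]

    return sorted(pgs, key=urgency, reverse=True)


if __name__ == "__main__":
    intervals = {"weekly": 7 * DAY, "monthly": 30 * DAY}
    now = 100 * DAY
    pgs = {
        "w.0": ("weekly", 96 * DAY),   # 4 days old
        "w.1": ("weekly", 99 * DAY),   # 1 day old
        "m.0": ("monthly", 60 * DAY),  # 40 days old, overdue
        "m.1": ("monthly", 90 * DAY),  # 10 days old
    }
    print(scrub_order(pgs, intervals, now))
```

The overdue monthly PG sorts ahead of both weekly PGs here, which is the behaviour the fairness question above is after. Whether a single ratio is enough, or whether a hard "at least every x period" floor is needed on top, is an open design choice.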

Gr. Stefan



