Re: Squid: deep scrub issues

Hi everyone,

Just to make sure everyone reading this thread gets the info, setting osd_scrub_disable_reservation_queuing to 'true' is a temporary workaround, as confirmed by Laimis on the tracker [1].
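
For anyone wanting to apply it, the workaround is an ordinary OSD config option and can be set cluster-wide with the usual ceph config commands (a sketch; double-check that the option exists on your exact Squid point release before relying on it):

```
# Temporary workaround from the tracker: answer replica scrub
# reservations immediately (pre-Squid behaviour) instead of queuing them
ceph config set osd osd_scrub_disable_reservation_queuing true

# Verify the override took effect
ceph config get osd osd_scrub_disable_reservation_queuing
```

Remember it is described as a stop-gap for mixed-version clusters and debugging, so plan to remove the override once a proper fix lands.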

Cheers,
Frédéric.

[1] https://tracker.ceph.com/issues/69078

----- On Dec 5, 2024, at 23:09, Laimis Juzeliūnas laimis.juzeliunas@xxxxxxxxxx wrote:

> Hi all,
> 
> Just came back from this year's Cephalocon and managed to get a quick chat with
> Ronen about this issue. He gave a great presentation [1, 2] on the upcoming
> changes to scrubbing in Tentacle, as well as some changes already made in the
> Squid release.
> The primary suspect here is the mclock scheduler and the way replica
> reservations are made since 19.2.0. Regular scrubs begin with the primary
> requesting permission from all acting-set replicas to let the scrub continue;
> each replica either grants the request immediately or queues it. As I
> understand it, previous releases would instead send a simple deny on the spot
> when resources were thin (that happens when the scrub map is requested from
> the acting-set members, but I might be wrong). For some reason, with mclock
> this can lead to acting sets constantly queuing these scrub requests and
> never actually completing them.
> As for the configuration: in Squid the osd_scrub_cost option is set to
> 52428800 for some reason. I'm having a hard time finding previous values, but
> the Red Hat docs [3] list it as 50 << 20 (which works out to the same
> 52428800). Unless the whole logic/calculation has changed, such an abysmal
> value will simply never allow resources to be granted with mclock.
> Another suspect is osd_scrub_event_cost, which has been set to 4096. Once
> again, I'm having a hard time finding any previous-version values to compare
> against.
> 
> One thing we've found is that there is now a config option
> osd_scrub_disable_reservation_queuing (default: false): "When set - scrub
> replica reservations are responded to immediately, with either success or
> failure (the pre-Squid version behaviour). This configuration option is
> introduced to support mixed-version clusters and debugging, and will be removed
> in the next release." My guess is that setting this to true would simply return
> scrubbing behaviour back to that of Reef and earlier releases.
> 
> To keep all the work done on the scrubbing changes in place, we will first try
> reducing osd_scrub_cost to a much lower value (50 or even less) and check
> whether that helps our case. If not, we will reduce osd_scrub_event_cost as
> well, since we're not sure at this point which of the two has the direct
> impact.
> If that doesn't help, we will have to set osd_scrub_disable_reservation_queuing
> to true, but that would simply leave us with the old way scrubs are done (not
> cool: we want the fancy new way). If that doesn't help either, we will have to
> start thinking about switching to wpq instead of mclock, which also isn't
> great looking into the future of Ceph.
> 
> I'll keep the mailing list (and tracker) updated with our findings.
> 
> Best,
> Laimis J.
> 
> 
> 1 -
> https://ceph2024.sched.com/event/1ktWh/the-scrub-type-to-limitations-matrix-ronen-friedman-ibm
> 2 - https://static.sched.com/hosted_files/ceph2024/08/ceph24_main%20%284%29.pdf
> 3 -
> https://docs.redhat.com/en/documentation/red_hat_ceph_storage/2/html/configuration_guide/osd_configuration_reference#scrubbing
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
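
The tuning sequence Laimis outlines above can be expressed with the usual ceph config commands (a sketch; the osd_scrub_cost value of 50 is the one proposed in the thread, while the osd_scrub_event_cost value below is purely a hypothetical lower setting, not a known-good default):

```
# Check what the cluster is currently using
ceph config get osd osd_scrub_cost          # Squid value reported: 52428800
ceph config get osd osd_scrub_event_cost    # Squid value reported: 4096

# Step 1: try a drastically lower scrub cost, as proposed in the thread
ceph config set osd osd_scrub_cost 50

# Step 2 (only if step 1 doesn't help): lower the per-event cost too
# (64 is a hypothetical value; the thread doesn't name one)
ceph config set osd osd_scrub_event_cost 64

# To revert either experiment, drop the override to fall back to the default
ceph config rm osd osd_scrub_cost
ceph config rm osd osd_scrub_event_cost
```

Since these are runtime overrides stored in the monitor config database, they apply to all OSDs without a restart and can be rolled back cleanly if they don't help.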



