Re: backfill_wait preventing deep scrubs

Mykola Golub <to.my.trociny@xxxxxxxxx> · Thu, 21 Sep 2023 17:53:15 +0300

On Thu, Sep 21, 2023 at 1:57 PM Frank Schilder <frans@xxxxxx> wrote:
>
> Hi all,
>
> I replaced a disk in our Octopus cluster and it is rebuilding. I noticed that since the replacement there is no scrubbing going on. It looks as if an OSD with a PG in backfill_wait state blocks deep scrubbing of all other PGs on that OSD as well.
>
> Some numbers: the pool in question has 8192 PGs with EC 8+3 and ca. 850 OSDs. A total of 144 PGs needed backfilling (they were remapped after the disk was replaced). After about 2 days we are down to 115 backfill_wait + 3 backfilling. It will take a bit more than a week to complete.
>
> There is plenty of time and IOPS available to deep-scrub PGs on the side, but since the backfill started there has been zero scrubbing/deep scrubbing going on, and "PGs not deep scrubbed in time" messages are piling up.
>
> Is there a way to allow (deep) scrub in this situation?

ceph config set osd osd_scrub_during_recovery true
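
To check that the override took effect, and to drop it again once backfill has finished, something along these lines should work. This is only a sketch: it assumes the centralized config database (`ceph config`, available since Mimic) and uses osd.0 purely as a placeholder OSD id.

```shell
# Value stored in the cluster config database for all OSDs
ceph config get osd osd_scrub_during_recovery

# Value one running daemon actually uses (osd.0 is a placeholder id)
ceph config show osd.0 osd_scrub_during_recovery

# After backfill completes: remove the override to return to the default
ceph config rm osd osd_scrub_during_recovery
```

Removing the override returns the option to its default (false), so scrubs are again held back on OSDs that are recovering.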


-- 
Mykola Golub