Re: The last 15 'degraded' items take as many hours as the first 15K?

Janne Johansson <icepic.dz@xxxxxxxxx> · Thu, 12 May 2022 09:05:15 +0200

On Thu, 12 May 2022 at 00:03, Harry G. Coin <hgcoin@xxxxxxxxx> wrote:
> Might someone explain why the count of degraded items can drop by
> thousands, sometimes tens of thousands, in the same number of hours it
> takes to go from 10 to 0? For example, when an OSD or a host with a few
> OSDs goes offline for a while or reboots.
>
> Sitting at one complete and entire degraded object out of millions for
> longer than it took to write this post.
>
> Seems the fewer the number of degraded objects, the less interested the
> cluster is in fixing it!

Different PGs most likely take different amounts of time and IO to
recover, depending on their size, the amount of metadata attached to
them and so on. If so, the PGs you see early on as part of the "35 PGs
are backfilling" include both slow and fast ones, and each fast one is
replaced by another PG as soon as it finishes. Once all the easy work
is done, only the slow ones remain. That makes it look as if the
cluster waited until the end and then "doesn't want to work as hard on
those as on the first ones", when in fact the total work was always
going to take a long time. (We had SMR drives on gig-eth boxes; when
one of those crashed it took .. aaaages to fix.) The easy parts pass
by very fast thanks to the parallelism in the repairs, leaving you
looking at the hard parts, but the PGs were never equal to begin with.
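
To make that concrete, here is a toy simulation in Python. It is not
Ceph code and does not model how Ceph actually schedules recovery; the
helper names (simulate_recovery, time_when_remaining) and all numbers
(990 easy PGs at 1 time unit, 10 hard ones at 200, 35 parallel slots)
are made up for illustration. It only shows that a fixed number of
parallel slots plus a few slow jobs gives a fast early drop followed by
a long tail.

```python
import heapq
import random


def simulate_recovery(durations, max_parallel):
    """Run jobs in list order with at most max_parallel at once.

    Returns a list of (finish_time, jobs_still_unfinished) tuples,
    one per completed job, in order of completion.
    """
    running = []          # min-heap of finish times of in-flight jobs
    timeline = []
    next_job = 0
    remaining = len(durations)

    # Fill the initial slots at time 0.
    while next_job < len(durations) and len(running) < max_parallel:
        heapq.heappush(running, durations[next_job])
        next_job += 1

    while running:
        now = heapq.heappop(running)      # earliest in-flight job finishes
        remaining -= 1
        timeline.append((now, remaining))
        if next_job < len(durations):     # freed slot picks up the next job
            heapq.heappush(running, now + durations[next_job])
            next_job += 1
    return timeline


def time_when_remaining(timeline, n):
    """First time at which at most n jobs are still unfinished."""
    for finish_time, remaining in timeline:
        if remaining <= n:
            return finish_time
    raise ValueError("timeline never drops to that level")


# Hypothetical workload: mostly easy PGs plus a handful of hard ones.
rng = random.Random(0)
durations = [1.0] * 990 + [200.0] * 10
rng.shuffle(durations)

timeline = simulate_recovery(durations, max_parallel=35)
total = timeline[-1][0]
for level in (500, 100, 10, 0):
    reached = time_when_remaining(timeline, level)
    print(f"{level:4d} left at t={reached:.0f} (total {total:.0f})")
```

With these made-up numbers the count falls from 1000 to 10 in a small
fraction of the total run time, and the last 10 account for nearly all
of the wall-clock time, even though every slot was busy the whole way.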

-- 
May the most significant bit of your life be positive.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx