Re: [ceph-users] EC Backfill Observations

Josh Baergen <jbaergen@xxxxxxxxxxxxxxxx> · Wed, 21 Apr 2021 10:29:54 -0600

Hey Josh,

Thanks for the info!

> With respect to reservations, it seems like an oversight that
> we don't reserve other shards for backfilling. We reserve all
> shards for recovery [0].

Very interesting that there is a reservation difference between
backfill and recovery.

> On the other hand, overload from recovery is handled better in
> pacific and beyond with mclock-based QoS, which provides much
> more effective control of recovery traffic [1][2].

Indeed, I suspected that mclock might ultimately be the answer here,
though I'm not sure how it behaves when a source OSD gets overloaded
in the way that I described. For example, will it throttle backfill
more aggressively than would have been needed if a reservation had
prevented the overload in the first place?

One more question in this space: has there ever been discussion about
a back-off mechanism for when one of the remote reservations is
blocked? Another issue that we commonly see is that a backfill that
should be able to make progress can't, because a PG in backfill_wait
holds some of its reservations while waiting for others. Example
(with simplified up/acting sets):

    PG   STATE                          UP     UP_PRIMARY  ACTING  ACTING_PRIMARY
    1.1  active+remapped+backfilling    [0,2]  0           [0,1]   0
    1.2  active+remapped+backfill_wait  [3,2]  3           [3,1]   3
    1.3  active+remapped+backfill_wait  [3,5]  3           [3,4]   3

1.3's backfill could make progress independently of 1.1, but it is
blocked behind 1.2 because the latter is holding the local reservation
on osd.3 while waiting for the remote reservation on osd.2, which 1.1
currently holds.
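
To make the scenario concrete, here is a toy model of it in Python. It
is only a sketch and not how the OSD actually implements reservations:
it assumes one local and one remote slot per OSD (as if
osd_max_backfills were 1), processes PGs in a fixed order, and ignores
priorities and asynchronous grants. The names schedule, PGS,
MAX_BACKFILLS and the back_off flag are made up for illustration; the
back-off behaviour is the hypothetical mechanism I'm asking about, not
an existing option.

```python
from collections import Counter

MAX_BACKFILLS = 1  # assumed slots per OSD per pool, as if osd_max_backfills=1

# (pg, primary OSD, backfill target OSD), read off the example above
PGS = [("1.1", 0, 2), ("1.2", 3, 2), ("1.3", 3, 5)]


def schedule(pgs, back_off=False):
    """Toy model: take a local slot on the primary, then a remote slot
    on the backfill target. With back_off=True (hypothetical), a PG
    that cannot get its remote slot gives its local slot back."""
    local = Counter()   # local slots in use, keyed by primary OSD
    remote = Counter()  # remote slots in use, keyed by target OSD
    states = {}
    for pg, primary, target in pgs:
        if local[primary] >= MAX_BACKFILLS:
            states[pg] = "wait_local"
            continue
        local[primary] += 1
        if remote[target] >= MAX_BACKFILLS:
            if back_off:
                local[primary] -= 1  # release so other PGs can proceed
                states[pg] = "backed_off"
            else:
                states[pg] = "wait_remote"  # keeps holding the local slot
            continue
        remote[target] += 1
        states[pg] = "backfilling"
    return states


if __name__ == "__main__":
    for back_off in (False, True):
        print(f"back_off={back_off}: {schedule(PGS, back_off)}")
```

Without back-off, 1.2 sits on osd.3's local slot and 1.3 never gets
past wait_local. With back-off, 1.2 releases the slot and 1.3 reaches
backfilling while 1.1 is still running.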

Josh