On Sat, May 01, 2021 at 08:29:48PM +0300, Mykola Golub wrote:
> On Thu, Apr 22, 2021 at 04:16:34PM +0300, Mykola Golub wrote:
>
> > I would like to bring some attention to a problem we have been
> > observing with nautilus, and which I reported here [1].
> >
> > If a pg is in backfill_unfound state ("unfound" objects were detected
> > during backfill), and one of the osds from the active set is restarted,
> > the state changes to clean, losing the information about unfound
> > objects.
> >
> > And when I tried to reproduce the issue on master with the same
> > scenario, the status did not change, but I observed the primary
> > osd crash after a non-primary restart.
>
> Ok. Now I seem to have a better understanding of what is going on here.
>
> As I wrote in [1], when `PrimaryLogPG::on_failed_pull` is called because
> the object is not found on the backfill source osd, the oid is removed
> from `backfills_in_flight` only if the backfill source is primary [2].
> In our case we are backfilling a non-primary EC shard, so the oid is
> not removed from `backfills_in_flight`. Later this causes the
> assertion failure in `PrimaryLogPG::_clear_recovery_state`.
>
> The behavior seems to have changed during the post-nautilus
> refactoring, in [3]. Previously for the EC backend the oid was removed
> from `backfills_in_flight` unconditionally, and now it is removed only
> if the source is primary.
>
> In [1] I questioned this change, but after investigating how it works,
> it now looks quite reasonable to me.
>
> So, the current behavior is: in `PrimaryLogPG::recover_backfill`,
> because the "unfound" oid is not removed from `backfills_in_flight`,
> `next_backfill_to_complete` is always set to the "unfound" oid [4],
> and `new_last_backfill` is no longer updated, still pointing to the
> object before the "unfound" oid. The backfill still continues and
> terminates only after all objects are pulled/pushed, but the "complete"
> position remains on the object before "unfound".
> After the backfill is finished the pg enters "backfill_unfound" state.
> When the pg is re-peered (e.g. after restarting an osd) it enters
> "backfilling" state, starting the backfill from the "unfound" oid
> position, detects the "unfound" object again, scans the remaining
> objects detecting they are already copied, and enters
> "backfill_unfound" state again with the same "complete" position on
> the "unfound" object.
>
> This looks like reasonable behavior to me, and the only problem is
> the reported assertion failure, which probably just needs to be
> removed?
>
> In Nautilus, because the "unfound" oid is removed from
> `backfills_in_flight`, the "complete" position does not stop on this
> oid, and when the backfill is finished the pg also enters
> "backfill_unfound" state, but the "complete" backfill position is at
> the end now. So when the pg is re-peered, the backfill is not restarted
> from the "unfound" position, the "unfound" object is not detected and
> the pg enters "clean" state.
>
> If my understanding is correct, it looks like we have to:
>
> 1) in master, fix the assertion failure, probably by just removing the
> assertion, and backport the fix.

https://github.com/ceph/ceph/pull/41270

> 2) in nautilus (direct commit), make the EC backend not remove the
> "unfound" oid from `backfills_in_flight`, to get the post-nautilus
> behavior.

https://github.com/ceph/ceph/pull/41293

> Does it make sense?
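
To make the difference between the two behaviors easier to follow, here is
a toy Python model of the description above. It is only a sketch and not
Ceph code: `simulate_backfill`, `state_after_repeer` and the
`keep_unfound_in_flight` flag are hypothetical names, plain string ordering
stands in for the real object ordering, and the in-flight set only loosely
mirrors `backfills_in_flight`.

```python
def simulate_backfill(oids, unfound, keep_unfound_in_flight):
    """Toy model: return the last oid counted as completely backfilled.

    keep_unfound_in_flight=True  ~ behavior described for master
    keep_unfound_in_flight=False ~ behavior described for nautilus (EC)
    """
    in_flight = set()   # loosely mirrors backfills_in_flight
    completed = []      # oids copied successfully, in order

    for oid in sorted(oids):
        in_flight.add(oid)
        if oid in unfound:
            # The pull failed: either leave the oid in flight or drop it.
            if not keep_unfound_in_flight:
                in_flight.discard(oid)
            continue
        in_flight.discard(oid)
        completed.append(oid)

    # The "complete" position cannot move past the earliest in-flight oid
    # (the role next_backfill_to_complete plays in the description).
    barrier = min(in_flight) if in_flight else None
    last_backfill = None
    for oid in completed:
        if barrier is not None and oid >= barrier:
            break
        last_backfill = oid
    return last_backfill


def state_after_repeer(oids, unfound, last_backfill):
    """Toy model: rescan only the objects after the "complete" position."""
    remaining = [o for o in sorted(oids)
                 if last_backfill is None or o > last_backfill]
    if any(o in unfound for o in remaining):
        return "backfill_unfound"
    return "clean"


oids = ["obj1", "obj2", "obj3", "obj4", "obj5"]
unfound = {"obj3"}

master_pos = simulate_backfill(oids, unfound, keep_unfound_in_flight=True)
nautilus_pos = simulate_backfill(oids, unfound, keep_unfound_in_flight=False)

print(master_pos, state_after_repeer(oids, unfound, master_pos))
print(nautilus_pos, state_after_repeer(oids, unfound, nautilus_pos))
```

In this model the first case stops the "complete" position at obj2 and
finds the unfound object again after re-peering, while the second case
advances to obj5 and the rescan sees nothing, which corresponds to the
"clean" state reported for nautilus.
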
>
> [1] https://tracker.ceph.com/issues/50351#note-1
> [2] https://github.com/ceph/ceph/blob/813933f81e3d682a0b1ae6dd906e38e78c4859a4/src/osd/PrimaryLogPG.cc#L12453
> [3] https://github.com/ceph/ceph/commit/8a8947d2a32d6390cb17099398e7f2212660c9a1
> [4] https://github.com/ceph/ceph/blob/813933f81e3d682a0b1ae6dd906e38e78c4859a4/src/osd/PrimaryLogPG.cc#L14010
>
> >
> > [1] https://tracker.ceph.com/issues/50351
>
> --
> Mykola Golub

--
Mykola Golub
_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx