On Wed, Sep 2, 2020 at 11:38 PM Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>
> Hi Joao & Kefu,
>
> I have a question about
> https://github.com/ceph/ceph/commit/e62269c8929e414284ad0773c4a3c82e43735e4e
> which was backported and released into v14.2.10.
>
> My understanding is that the intention was to ignore the osd_epoch of
> down osds, so that we can trim osdmaps up to the min of (a) the lowest
> per-pool clean epoch and (b) the lowest clean epoch of all up osds.
> (See [1] and [2] for motivation).
>
> Before this commit, get_min_last_epoch_clean would loop over *all* osd
> epochs and lower the floor if needed.
>
> Now after the commit we only check the epochs of the *out* osds.
> Isn't that logic inverted? Shouldn't we be looping over all the *in* osds? [3]
>
> This commit has passed by many eyes already so I must be confused...
> Please help :-/

hi Dan,

thanks for pointing this out! indeed! i created
https://tracker.ceph.com/issues/47290 to track this issue. and i will
create a fix based on your one-liner change.

>
> (I ask because we already have evidence running 14.2.11 that maps are
> still not trimmed when we mark out a broken osd -- we had to restart
> the mon leader to provoke the trimming).
>
> Thanks,
>
> Dan
>
> [1] https://tracker.ceph.com/issues/37875#note-6
> [2] https://lists.ceph.io/hyperkitty/list/dev@xxxxxxx/thread/6KSOLVLWR6HZOVUY7USPPPL7JBHDX7JA/
>
> [3]
> @@ -2251,7 +2251,7 @@ epoch_t OSDMonitor::get_min_last_epoch_clean() const
>    // don't trim past the oldest reported osd epoch
>    for (auto [osd, epoch] : osd_epochs) {
>      if (epoch < floor &&
> -        osdmap.is_out(osd)) {
> +        osdmap.is_in(osd)) {
>        floor = epoch;
>      }
>    }
> _______________________________________________
> Dev mailing list -- dev@xxxxxxx
> To unsubscribe send an email to dev-leave@xxxxxxx

--
Regards
Kefu Chai
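
For readers following the thread, the effect of the one-line change in [3] can be illustrated with a small standalone sketch. This is not Ceph's code: the function, the Filter enum, and the plain std::map / std::set stand-ins for osd_epochs and the osdmap's in/out state are all hypothetical, and the per-pool floor is assumed to be computed elsewhere. It only models how the floor moves depending on which osds the loop considers.

```cpp
// Hypothetical model of the floor calculation discussed in this thread.
// None of these names are Ceph APIs; they are stand-ins for illustration.
#include <cassert>
#include <cstdint>
#include <map>
#include <set>

using epoch_t = std::uint32_t;

// Which reported osd epochs are allowed to lower the floor.
enum class Filter {
  All,      // behaviour described for the code before the commit
  OnlyOut,  // behaviour described for the code after the commit
  OnlyIn    // behaviour proposed by the one-line change
};

// pool_floor: lowest per-pool clean epoch (assumed computed elsewhere)
// osd_epochs: last epoch reported by each osd id
// in_osds:    ids of the osds currently marked "in"
epoch_t min_last_epoch_clean(epoch_t pool_floor,
                             const std::map<int, epoch_t>& osd_epochs,
                             const std::set<int>& in_osds,
                             Filter filter) {
  epoch_t floor = pool_floor;
  for (const auto& entry : osd_epochs) {
    const int osd = entry.first;
    const epoch_t epoch = entry.second;
    const bool is_in = in_osds.count(osd) > 0;
    const bool considered = filter == Filter::All ||
                            (filter == Filter::OnlyIn && is_in) ||
                            (filter == Filter::OnlyOut && !is_in);
    // Don't trim past the oldest reported epoch among the considered osds.
    if (considered && epoch < floor) {
      floor = epoch;
    }
  }
  return floor;
}

// Made-up scenario: osd 2 is broken, marked out, and stuck at epoch 100,
// while the healthy in osds report epochs 950 and 900.
inline void check_scenario() {
  const std::map<int, epoch_t> osd_epochs{{0, 950}, {1, 900}, {2, 100}};
  const std::set<int> in_osds{0, 1};
  const epoch_t pool_floor = 1000;

  // Considering every osd, or only the out ones, pins the floor at 100.
  assert(min_last_epoch_clean(pool_floor, osd_epochs, in_osds, Filter::All) == 100);
  assert(min_last_epoch_clean(pool_floor, osd_epochs, in_osds, Filter::OnlyOut) == 100);
  // Considering only the in osds lets the floor advance to 900.
  assert(min_last_epoch_clean(pool_floor, osd_epochs, in_osds, Filter::OnlyIn) == 900);
}
```

In this made-up scenario, filtering on the out osds keeps the floor pinned at the broken osd's stale epoch, which matches the symptom reported above (maps not trimmed after marking out a broken osd), while filtering on the in osds lets the floor advance.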