Hmm, just noticed I never replied to this. You are correct and I was not
reading carefully enough; send_notify is only set to false on Activate. I
still think something in the stack will block messages if they're from a
too-new epoch, or that this is correct for some other reason (we do a
*lot* of OSD thrashing tests), but I didn't track down exactly how/why.
-Greg

On Wed, Nov 1, 2017 at 11:35 PM Xinze Chi (信泽) <xmdxcxz@xxxxxxxxx> wrote:
>
> Does the Stray set send_notify to false only when it goes to Activate?
>
> 2017-11-02 10:27 GMT+08:00 Gregory Farnum <gfarnum@xxxxxxxxxx>:
> > On Wed, Nov 1, 2017 at 5:27 PM Xinze Chi (信泽) <xmdxcxz@xxxxxxxxx> wrote:
> >>
> >> 2017-11-02 4:26 GMT+08:00 Gregory Farnum <gfarnum@xxxxxxxxxx>:
> >> > On Fri, Oct 27, 2017 at 12:46 AM Xinze Chi (信泽) <xmdxcxz@xxxxxxxxx> wrote:
> >> >>
> >> >> hi, all:
> >> >>
> >> >> I am confused about the notify message during peering. For example:
> >> >>
> >> >> In epoch 1 the primary OSD does peering (GetInfo and GetMissing) and
> >> >> calls proc_replica_log; in that function last_complete and
> >> >> last_update may be reset.
> >> >>
> >> >> Before it goes to Activate, the OSDMap changes (the new osdmap does
> >> >> not restart peering) and the non-primary OSD sends a notify to the
> >> >> primary.
> >> >
> >> >
> >> > I don't think this can happen. The OSD won't re-send a notify during
> >> > the same peering interval, and even if it did the message would be
> >> > tagged with a new (higher) epoch so the PG wouldn't process it until
> >> > after it had switched states, right?
> >> >
> >>
> >> I just want to understand this algorithm. When the Stray OSD receives
> >> ActMap it will send the notify even during the same peering interval;
> >> see Stray::react(const ActMap&).
> >
> > Note the
> >
> > if (pg->should_send_notify()
> >
> > check preceding that block. It checks a boolean send_notify value that
> > is set true only when it enters a new peering interval, and is set
> > false as soon as it shares its info. So I don't think the primary's
> > behavior matters at all (other than from a security perspective,
> > anyway).
> >
> >
> >> You say the primary OSD wouldn't process the notify msg, but I can't
> >> find that code. The primary calls handle_pg_notify and processes it.
> >
> > I didn't actually track the order of the state machine here; I just
> > saw that PG::RecoveryState::Active::react(const MNotifyRec& notevt)
> > will throw them out if it's already seen the info. You're right that
> > PG::RecoveryState::Primary::react(const MNotifyRec& notevt) will
> > process it unconditionally. I'm not sure if those are the replica and
> > primary states, or if you move from Primary to Active (or vice versa).
> > -Greg
> >
> >>
> >> >>
> >> >> When the primary receives the notify, Primary::react(const
> >> >> MNotifyRec& notevt) calls proc_replica_info.
> >> >>
> >> >> In that function we update the pg info, including the last_complete
> >> >> and last_update that were modified in proc_replica_log.
> >> >
> >> > Note also that "PG::RecoveryState::Active::react(const MNotifyRec&
> >> > notevt)" does *not* unconditionally invoke proc_replica_info(). I
> >> > think you were trying to say we hadn't reached this state on receipt
> >> > of the message? But as I mentioned above, I think we block so that's
> >> > not actually possible either.
> >> >
> >> >>
> >> >> When the primary calls activate, it then does recovery based on the
> >> >> pg info obtained from the notify instead of from proc_replica_log.
> >> >>
> >> >> So is it a bug?
> >> >
> >> > Have you seen issues in the wild, or are you just trying to
> >> > understand this code/algorithm? I would be surprised if we had
> >> > undiscovered issues here just because our tests exercise peering
> >> > quite vigorously, but I might be missing what's happening in my own
> >> > code skims.
> >> > -Greg
> >>
> >> --
> >> Regards,
> >> Xinze Chi
>
> --
> Regards,
> Xinze Chi
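
For anyone reading this thread later, here is a minimal, self-contained
sketch of the send_notify guard being discussed. It is not the actual Ceph
source: only send_notify, should_send_notify(), and the Stray-on-ActMap
shape come from the thread, while the PG struct, on_new_interval(),
info_shared(), and the printed messages are illustrative assumptions.

// Minimal sketch (not the real Ceph code) of the send_notify guard.
// Per Greg's earlier message the flag is cleared once the info has been
// shared; his top reply notes the real code apparently only clears it on
// Activate, which is exactly the timing under discussion.
#include <iostream>

struct PG {
  bool send_notify = false;

  // Set when the PG enters a new peering interval.
  void on_new_interval() { send_notify = true; }

  bool should_send_notify() const { return send_notify; }

  // Cleared once this stray/replica has shared its info with the primary.
  void info_shared() { send_notify = false; }
};

// Rough shape of Stray::react(const ActMap&): the notify is only re-sent
// while send_notify is still true, so within one peering interval a stray
// notifies the primary at most once.
void stray_react_actmap(PG& pg) {
  if (pg.should_send_notify()) {
    std::cout << "sending pg_notify to primary\n";
    pg.info_shared();
  } else {
    std::cout << "notify suppressed: info already shared this interval\n";
  }
}

int main() {
  PG pg;
  pg.on_new_interval();    // new peering interval -> notify allowed
  stray_react_actmap(pg);  // sends notify, clears the flag
  stray_react_actmap(pg);  // another ActMap in the same interval -> suppressed
  return 0;
}

If the flag really stays set until Activate, the second ActMap above would
re-send the notify, which is the scenario the original question is about.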