On Tue, Mar 02, 2021 at 05:15:57PM -0800, Paul E. McKenney wrote: > The first question is of course: Did you try this with lockdep enabled? ;-) Yep I always do. But I may miss some configs on my testings. I usually test at least TREE01 on x86 and arm64. > > @@ -1702,43 +1692,50 @@ bool rcu_is_nocb_cpu(int cpu) > > return false; > > } > > > > -/* > > - * Kick the GP kthread for this NOCB group. Caller holds ->nocb_lock > > - * and this function releases it. > > - */ > > -static bool wake_nocb_gp(struct rcu_data *rdp, bool force, > > - unsigned long flags) > > - __releases(rdp->nocb_lock) > > +static bool __wake_nocb_gp(struct rcu_data *rdp_gp, > > + struct rcu_data *rdp, > > + bool force, unsigned long flags) > > + __releases(rdp_gp->nocb_gp_lock) > > { > > bool needwake = false; > > - struct rcu_data *rdp_gp = rdp->nocb_gp_rdp; > > > > - lockdep_assert_held(&rdp->nocb_lock); > > if (!READ_ONCE(rdp_gp->nocb_gp_kthread)) { > > - rcu_nocb_unlock_irqrestore(rdp, flags); > > + raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags); > > trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, > > TPS("AlreadyAwake")); > > return false; > > } > > > > - if (READ_ONCE(rdp->nocb_defer_wakeup) > RCU_NOCB_WAKE_NOT) { > > - WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT); > > - del_timer(&rdp->nocb_timer); > > + if (rdp_gp->nocb_defer_wakeup > RCU_NOCB_WAKE_NOT) { > > So there are no longer any data races involving ->nocb_defer_wakeup? > > (Yes, I could fire up KCSAN, but my KCSAN-capable system is otherwise > occupied for several more hours.) To be more specific, there is no more unlocked write to the timer (queue/cancel) and its nocb_defer_wakeup matching state. And there is only one (on purpose) racy reader of ->nocb_defer_wakeup which is the non-timer deferred wakeup. So the writes to the timer keep their WRITE_ONCE() and only the reader in do_nocb_deferred_wakeup() keeps its READ_ONCE(). Other readers are protected by the ->nocb_gp_lock. > > + > > // Advance callbacks if helpful and low contention. > > needwake_gp = false; > > if (!rcu_segcblist_restempty(&rdp->cblist, > > @@ -2178,11 +2182,18 @@ static void nocb_gp_wait(struct rcu_data *my_rdp) > > my_rdp->nocb_gp_bypass = bypass; > > my_rdp->nocb_gp_gp = needwait_gp; > > my_rdp->nocb_gp_seq = needwait_gp ? wait_gp_seq : 0; > > - if (bypass && !rcu_nocb_poll) { > > - // At least one child with non-empty ->nocb_bypass, so set > > - // timer in order to avoid stranding its callbacks. > > + if (bypass) { > > raw_spin_lock_irqsave(&my_rdp->nocb_gp_lock, flags); > > - mod_timer(&my_rdp->nocb_bypass_timer, j + 2); > > + // Avoid race with first bypass CB. > > + if (my_rdp->nocb_defer_wakeup > RCU_NOCB_WAKE_NOT) { > > + WRITE_ONCE(my_rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT); > > + del_timer(&my_rdp->nocb_timer); > > + } > > Given that the timer does not get queued if rcu_nocb_poll, why not move the > above "if" statement under the one following? It's done later in the set. > > > + if (!rcu_nocb_poll) { > > + // At least one child with non-empty ->nocb_bypass, so set > > + // timer in order to avoid stranding its callbacks. > > + mod_timer(&my_rdp->nocb_bypass_timer, j + 2); > > + } > > raw_spin_unlock_irqrestore(&my_rdp->nocb_gp_lock, flags); > > } > > if (rcu_nocb_poll) { > > @@ -2385,7 +2399,10 @@ static void do_nocb_deferred_wakeup_timer(struct timer_list *t) > > */ > > static bool do_nocb_deferred_wakeup(struct rcu_data *rdp) > > { > > - if (rcu_nocb_need_deferred_wakeup(rdp)) > > + if (!rdp->nocb_gp_rdp) > > + return false; > > This check was not necessary previously because each CPU used its own rdp, > correct? Exactly! > The theory is that this early return is taken only during boot, > and that the spawning of the kthreads will act as an implicit wakeup? You guessed right! That probably deserve a comment. Thanks!