On Thu, Jun 08, 2017 at 01:11:48PM -0700, Krister Johansen wrote:
> Hi Paul,
>
> On Thu, May 25, 2017 at 02:59:18PM -0700, Paul E. McKenney wrote:
> > Wait/wakeup operations do not guarantee ordering on their own.  Instead,
> > either locking or memory barriers are required.  This commit therefore
> > adds memory barriers to wake_nocb_leader() and nocb_leader_wait().
> >
> > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> > ---
> >  kernel/rcu/tree_plugin.h | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > index 0b1042545116..573fbe9640a0 100644
> > --- a/kernel/rcu/tree_plugin.h
> > +++ b/kernel/rcu/tree_plugin.h
> > @@ -1810,6 +1810,7 @@ static void wake_nocb_leader(struct rcu_data *rdp, bool force)
> >  	if (READ_ONCE(rdp_leader->nocb_leader_sleep) || force) {
> >  		/* Prior smp_mb__after_atomic() orders against prior enqueue. */
> >  		WRITE_ONCE(rdp_leader->nocb_leader_sleep, false);
> > +		smp_mb(); /* ->nocb_leader_sleep before swake_up(). */
> >  		swake_up(&rdp_leader->nocb_wq);
> >  	}
> >  }
> > @@ -2064,6 +2065,7 @@ static void nocb_leader_wait(struct rcu_data *my_rdp)
> >  	 * nocb_gp_head, where they await a grace period.
> >  	 */
> >  	gotcbs = false;
> > +	smp_mb(); /* wakeup before ->nocb_head reads. */
> >  	for (rdp = my_rdp; rdp; rdp = rdp->nocb_next_follower) {
> >  		rdp->nocb_gp_head = READ_ONCE(rdp->nocb_head);
> >  		if (!rdp->nocb_gp_head)
>
> May I impose upon you to CC this patch to stable, and tag it as fixing
> abedf8e241?  I ran into this on a production 4.9 branch.  When I
> debugged it, I discovered that it went all the way back to 4.6.  The
> tl;dr is that at least for some environments, the missed wakeup
> manifests itself as a series of hung-task warnings to console and if I'm
> unlucky it can also generate a hang that can block interactive logins
> via ssh.

Interesting!  This is the first that I have heard that this was
anything other than a theoretical bug.
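
For concreteness, here is a rough userspace analogue of the ordering the
quoted patch is after: a full fence on the waker side paired with a full
fence on the waiter side, so that work published before the wakeup is
visible after it.  This is only a sketch and not the kernel code.  It
assumes C11 atomics and pthreads, uses atomic_thread_fence() where the
patch uses smp_mb(), and lets the flag store double as the wakeup instead
of calling swake_up().  The names head, leader_sleep, waker(),
leader_wait() and run_wakeup_sketch() are made up for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's fields. */
static atomic_int head;			/* plays the role of ->nocb_head */
static atomic_bool leader_sleep;	/* plays the role of ->nocb_leader_sleep */

/* Waker side: publish work, fence, then clear the sleep flag. */
static void *waker(void *unused)
{
	(void)unused;
	atomic_store_explicit(&head, 42, memory_order_relaxed);	/* "enqueue" */
	/* Full fence: the enqueue is ordered before the flag store. */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store_explicit(&leader_sleep, false, memory_order_relaxed);
	return NULL;
}

/* Waiter side: wait for the flag, fence, and only then read the work. */
static int leader_wait(void)
{
	while (atomic_load_explicit(&leader_sleep, memory_order_relaxed))
		;	/* spin; a real waiter would block instead */
	/* Full fence: the wakeup is ordered before the read of head. */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&head, memory_order_relaxed);
}

/* Runs one waker against one waiter and returns what the waiter saw. */
int run_wakeup_sketch(void)
{
	pthread_t tid;
	int seen;
	int ret;

	/* Reset state so the sketch can be run more than once. */
	atomic_store(&head, 0);
	atomic_store(&leader_sleep, true);

	ret = pthread_create(&tid, NULL, waker, NULL);
	assert(ret == 0);
	seen = leader_wait();
	ret = pthread_join(tid, NULL);
	assert(ret == 0);
	return seen;
}
```

With both fences in place, the C11 memory model guarantees that
run_wakeup_sketch() returns 42.  With the fences removed and only relaxed
accesses left, that guarantee is gone, even though strongly ordered
hardware may hide the problem most of the time.
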
To the comment in your second URL, it is wise to recall that a
seismologist was in fact arrested for failing to predict an earthquake.
Later acquitted/pardoned/whatever, but arrested nonetheless.  ;-)

https://www.theguardian.com/world/2012/oct/23/jailing-italian-seismologists-scientific-community

Silliness aside, does my patch actually fix your problem in practice as
well as in theory?  If so, may I have your Tested-by?

Impressive investigative effort, by the way!

							Thanx, Paul