On Mon, Oct 02, 2023 at 11:09:39PM +0200, Frederic Weisbecker wrote:
> > >  	spin_unlock_rcu_node(sdp);  /* Interrupts remain disabled. */
> > >  	WRITE_ONCE(ssp->srcu_sup->srcu_gp_start, jiffies);
> > >  	WRITE_ONCE(ssp->srcu_sup->srcu_n_exp_nodelay, 0);
> > > @@ -1245,7 +1243,18 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
> > >  	rcu_segcblist_advance(&sdp->srcu_cblist,
> > >  			      rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
> > >  	s = rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq);
> > > -	(void)rcu_segcblist_accelerate(&sdp->srcu_cblist, s);
> > > +	/*
> > > +	 * Acceleration might fail if the preceding call to
> > > +	 * rcu_segcblist_advance() also failed due to a prior grace
> > > +	 * period seen incomplete before rcu_seq_snap(). If so then a new
> > > +	 * call to advance will see the completed grace period and fix
> > > +	 * the situation.
> > > +	 */
> > > +	if (!rcu_segcblist_accelerate(&sdp->srcu_cblist, s)) {
> >
> > We can add below also? Here old and new are rcu_seq_current() values used in
> > the 2 calls to rcu_segcblist_advance().
> >
> > WARN_ON_ONCE(!(rcu_seq_completed_gp(old, new) && rcu_seq_new_gp(old, new)));
>
> Very good point! "new" should be exactly one and a half grace period away from
> "old", will add that.
>
> Cooking proper patches now.

Actually, there is a simpler fix, below. rcu_seq_snap() can be called before
rcu_segcblist_advance() after all. The only side effect is that callback
advancing then happens _after_ the full barrier in rcu_seq_snap(). I don't see
an obvious problem with that, as that barrier only cares about:

1) Ordering accesses of the update side before call_srcu() so they don't bleed
2) Seeing all the accesses prior to the grace period of the current gp_num

The only things callback advancing needs to be ordered against are carried by
snp locking.

I will still remove the accelerations elsewhere, and the advancing in
srcu_gp_start(), in further patches.
I'll also avoid advancing and accelerating in srcu_gp_start_if_needed() if
there is no callback to queue.

The point is also that this simple fix alone can be easily backported and the
rest can come as cleanups.

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 5602042856b1..8b09fb37dbf3 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1244,10 +1244,10 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
 	spin_lock_irqsave_sdp_contention(sdp, &flags);
 	if (rhp)
 		rcu_segcblist_enqueue(&sdp->srcu_cblist, rhp);
+	s = rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq);
 	rcu_segcblist_advance(&sdp->srcu_cblist,
 			      rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
-	s = rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq);
-	(void)rcu_segcblist_accelerate(&sdp->srcu_cblist, s);
+	WARN_ON_ONCE(!rcu_segcblist_accelerate(&sdp->srcu_cblist, s) && rhp);
 	if (ULONG_CMP_LT(sdp->srcu_gp_seq_needed, s)) {
 		sdp->srcu_gp_seq_needed = s;
 		needgp = true;
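
For illustration only (not part of the patch): a rough userspace model of why
the ordering matters. It assumes the usual gp_seq layout (two low state bits)
and reduces the callback list to the tags of its two waiting segments. The
names model_seq_snap() and model_can_accelerate() are made up for this sketch,
memory barriers are left out, and comparisons ignore counter wrap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SEQ_CTR_SHIFT	2
#define SEQ_STATE_MASK	((1UL << SEQ_CTR_SHIFT) - 1)

/*
 * Models the arithmetic of rcu_seq_snap() (barriers omitted): the
 * sequence value at which a full grace period starting after "seq"
 * was read is guaranteed to have ended.
 */
static unsigned long model_seq_snap(unsigned long seq)
{
	return (seq + 2 * SEQ_STATE_MASK + 1) & ~SEQ_STATE_MASK;
}

/*
 * Hypothetical helper: can a new callback needing grace period "s" be
 * tagged, given the tags already pending in the two waiting segments
 * (0 means "segment empty") and the counter value "cur" seen by the
 * advance step? Advancing retires every tag <= cur, which frees a
 * segment. Otherwise the new callback needs a segment already tagged
 * with s.
 */
static bool model_can_accelerate(unsigned long wait_tag,
				 unsigned long next_ready_tag,
				 unsigned long cur, unsigned long s)
{
	unsigned long tags[2] = { wait_tag, next_ready_tag };
	size_t i;

	assert((s & SEQ_STATE_MASK) == 0);	/* snapshots carry no state bits */

	for (i = 0; i < 2; i++) {
		if (tags[i] == 0 || tags[i] <= cur)
			return true;	/* empty, or retired by advance */
		if (tags[i] == s)
			return true;	/* same grace period: share segment */
	}
	return false;			/* both segments busy with older tags */
}
```

With pending tags 4 and 8, the old order (advance sees 1, then the snapshot is
taken once the counter has reached 5) gives s = 12 and no segment to put it in.
Taking the snapshot first gives s = 8 while the counter reads 1, and any later
value seen by the advance step can only retire more tags.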