On Sat, Dec 16, 2023 at 4:17 PM Frederic Weisbecker <frederic@xxxxxxxxxx> wrote:
>
> On Mon, Dec 11, 2023 at 01:57:16AM +0000, Joel Fernandes (Google) wrote:
> > The comments added in commit 1ef990c4b36b ("srcu: No need to
> > advance/accelerate if no callback enqueued") are a bit confusing to me.
>
> I know some maintainers who may argue that in the changelog world, the first
> person doesn't exist :-)

Heh, that's fair. Ok I can drop the 'to me'. ;-)

> > The comments are describing a scenario for code that was moved and is
> > no longer the way it was (snapshot after advancing). Improve the code
> > comments to reflect this and also document by acceleration can never
>
> s/by/why

Ok.

> > fail.
> >
> > Cc: Frederic Weisbecker <frederic@xxxxxxxxxx>
> > Cc: Neeraj Upadhyay <neeraj.iitr10@xxxxxxxxx>
> > Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
> > ---
> > v1->v2: Fix typo in change log.
> >
> >  kernel/rcu/srcutree.c | 24 ++++++++++++++++++++----
> >  1 file changed, 20 insertions(+), 4 deletions(-)
> >
> > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> > index 0351a4e83529..051e149490d1 100644
> > --- a/kernel/rcu/srcutree.c
> > +++ b/kernel/rcu/srcutree.c
> > @@ -1234,11 +1234,20 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
> >  	if (rhp)
> >  		rcu_segcblist_enqueue(&sdp->srcu_cblist, rhp);
> >  	/*
> > -	 * The snapshot for acceleration must be taken _before_ the read of the
> > -	 * current gp sequence used for advancing, otherwise advancing may fail
> > -	 * and acceleration may then fail too.
> > +	 * It's crucial to capture the snapshot 's' for acceleration before
> > +	 * reading the current gp_seq that is used for advancing. This is
> > +	 * essential because if the acceleration snapshot is taken after a
> > +	 * failed advancement attempt, there's a risk that a grace period may
> > +	 * conclude and a new one may start in the interim. If the snapshot is
> > +	 * captured after this sequence of events, the acceleration snapshot 's'
> > +	 * could be excessively advanced, leading to acceleration failure.
> > +	 * In such a scenario, an 'acceleration leak' can occur, where new
> > +	 * callbacks become indefinitely stuck in the RCU_NEXT_TAIL segment.
> > +	 * Also note that encountering advancing failures is a normal
> > +	 * occurrence when the grace period for RCU_WAIT_TAIL is in progress.
> >  	 *
> > -	 * This could happen if:
> > +	 * To see this, consider the following events which occur if
> > +	 * rcu_seq_snap() were to be called after advance:
> >  	 *
> >  	 * 1) The RCU_WAIT_TAIL segment has callbacks (gp_num = X + 4) and the
> >  	 *    RCU_NEXT_READY_TAIL also has callbacks (gp_num = X + 8).
> > @@ -1264,6 +1273,13 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
> >  	if (rhp) {
> >  		rcu_segcblist_advance(&sdp->srcu_cblist,
> >  				      rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
> > +		/*
> > +		 * Acceleration can never fail because the state of gp_seq used
> > +		 * for advancing is <= the state of gp_seq used for
> > +		 * acceleration.
>
> What do you mean by "state" here?

State means "value at a certain point in time" here.

> If it's the gp_seq number, that doesn't look right.

Uff, I screwed up the comment. I swapped "acceleration" and "advancing".
I should say:
"Acceleration can never fail because the state of gp_seq value used for
acceleration is <= the state of gp_seq used for advancing."

Does that sound correct now?

> The situation raising the initial bug also involved a gp_seq used for
> advancing <= the gp_seq used for acceleration.

Right, which I understand is the bug.

thanks,

 - Joel
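
P.S. To make the ordering argument concrete, here is a rough user-space
sketch of the sequence arithmetic, not kernel code. It assumes the usual
two-bit state field in gp_seq and copies the arithmetic of rcu_seq_snap(),
rcu_seq_start() and rcu_seq_end() onto plain values. All model_* names,
the two snap_* helpers and model_accelerate_fails() are made up for
illustration, and the failure test is a deliberate simplification of
rcu_segcblist_accelerate() that only looks at RCU_NEXT_READY_TAIL and
ignores counter wrap.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed layout: the low two bits of gp_seq hold the grace-period state. */
#define SEQ_STATE_MASK 3UL
#define STATE_SCAN1    1UL

/* Same arithmetic as rcu_seq_snap(), applied to a plain value. */
static unsigned long model_seq_snap(unsigned long gp_seq)
{
	return (gp_seq + 2 * SEQ_STATE_MASK + 1) & ~SEQ_STATE_MASK;
}

/* Same arithmetic as rcu_seq_end(): round up to the next idle value. */
static unsigned long model_seq_end(unsigned long gp_seq)
{
	return (gp_seq | SEQ_STATE_MASK) + 1;
}

/* Same arithmetic as rcu_seq_start(): idle value plus one. */
static unsigned long model_seq_start(unsigned long gp_seq)
{
	return gp_seq + 1;
}

/*
 * Simplified stand-in for the failure case: RCU_NEXT_READY_TAIL holds
 * callbacks for an older grace period than the snapshot asks for, so the
 * new callback cannot leave RCU_NEXT_TAIL. Wrap-around is ignored.
 */
static bool model_accelerate_fails(unsigned long next_ready_gp_num,
				   unsigned long s)
{
	return next_ready_gp_num < s;
}

/* Order used by the patch: take the snapshot first. */
static unsigned long snap_before_advance(unsigned long gp_seq)
{
	return model_seq_snap(gp_seq);
}

/* Old order: a GP ends and the next one starts before the snapshot. */
static unsigned long snap_after_interim_gp(unsigned long gp_seq)
{
	gp_seq = model_seq_start(model_seq_end(gp_seq));
	return model_seq_snap(gp_seq);
}

/* Walks the X + 4 / X + 8 scenario from the quoted comment for one X. */
static void model_check_scenario(unsigned long x)
{
	unsigned long gp_seq = x + STATE_SCAN1;	/* GP for X + 4 in progress */
	unsigned long next_ready = x + 8;	/* RCU_NEXT_READY_TAIL gp_num */

	assert(!model_accelerate_fails(next_ready, snap_before_advance(gp_seq)));
	assert(model_accelerate_fails(next_ready, snap_after_interim_gp(gp_seq)));
}
```

Calling model_check_scenario() with any X that is a multiple of 4 walks
the two orderings: the early snapshot comes out as X + 8 and can merge
with what is already in RCU_NEXT_READY_TAIL, while the late one comes out
as X + 12 and gets stuck behind it.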