On Mon, Oct 2, 2023 at 4:10 AM Frederic Weisbecker <frederic@xxxxxxxxxx> wrote:
>
> On Mon, Oct 02, 2023 at 12:32:41AM +0200, Frederic Weisbecker wrote:
> > On Sun, Oct 01, 2023 at 07:57:14AM +0530, Neeraj upadhyay wrote:
> > >
> > > But "more" only checks for CBs in DONE tail. The callbacks which have
> > > just been accelerated are not advanced to DONE tail.
> > >
> > > Having said that, I am still trying to figure out which callbacks are
> > > actually being pointed to by NEXT tail. Given that __call_srcu() already
> > > does an advance and accelerate, all enqueued callbacks would be in either
> > > WAIT tail (the callbacks for current active GP) or NEXT_READY tail (the
> > > callbacks for next GP after current one completes). So, they should
> > > already have GP num assigned and srcu_invoke_callbacks() won't accelerate
> > > them. Only case I can think of is, if current GP completes after we sample
> > > rcu_seq_current(&ssp->srcu_gp_seq) for rcu_segcblist_advance() (so, WAIT
> > > tail cbs are not moved to DONE tail) and a new GP is started before we
> > > take snapshot ('s') of next GP for rcu_segcblist_accelerate(), then
> > > gp num 's' is greater than the gp num of NEXT_READY_TAIL and the callback
> > > will be put in NEXT tail. Not sure if my understanding is correct here.
> > > Thoughts?
> > >
> > > __call_srcu()
> > >     rcu_segcblist_advance(&sdp->srcu_cblist,
> > >                           rcu_seq_current(&ssp->srcu_gp_seq));
> > >     s = rcu_seq_snap(&ssp->srcu_gp_seq);
> > >     (void)rcu_segcblist_accelerate(&sdp->srcu_cblist, s);
> >
> > Good point! This looks plausible.
> >
> > Does the (buggy) acceleration in srcu_invoke_callbacks() exist solely
> > to handle that case?
> > Because then the acceleration could be moved before
> > the advance on callbacks handling, something like:
> >
> > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> > index 20d7a238d675..af9d8af1d321 100644
> > --- a/kernel/rcu/srcutree.c
> > +++ b/kernel/rcu/srcutree.c
> > @@ -1245,6 +1245,11 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
> >  	rcu_segcblist_advance(&sdp->srcu_cblist,
> >  			      rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
> >  	s = rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq);
> > +	/*
> > +	 * Acceleration might fail if the preceding call to
> > +	 * rcu_segcblist_advance() also failed due to a prior incomplete grace
> > +	 * period. This should be later fixed in srcu_invoke_callbacks().
> > +	 */
> >  	(void)rcu_segcblist_accelerate(&sdp->srcu_cblist, s);
> >  	if (ULONG_CMP_LT(sdp->srcu_gp_seq_needed, s)) {
> >  		sdp->srcu_gp_seq_needed = s;
> > @@ -1692,6 +1697,13 @@ static void srcu_invoke_callbacks(struct work_struct *work)
> >  	ssp = sdp->ssp;
> >  	rcu_cblist_init(&ready_cbs);
> >  	spin_lock_irq_rcu_node(sdp);
> > +	/*
> > +	 * Acceleration might have failed in srcu_gp_start_if_needed() if
> > +	 * the preceding call to rcu_segcblist_advance() also failed due to
> > +	 * a prior incomplete grace period.
> > +	 */
> > +	(void)rcu_segcblist_accelerate(&sdp->srcu_cblist,
> > +				       sdp->srcu_gp_seq_needed);
> >  	rcu_segcblist_advance(&sdp->srcu_cblist,
> >  			      rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
> >  	if (sdp->srcu_cblist_invoking ||
> > @@ -1720,8 +1732,6 @@ static void srcu_invoke_callbacks(struct work_struct *work)
> >  	 */
> >  	spin_lock_irq_rcu_node(sdp);
> >  	rcu_segcblist_add_len(&sdp->srcu_cblist, -len);
> > -	(void)rcu_segcblist_accelerate(&sdp->srcu_cblist,
> > -				       rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq));
> >  	sdp->srcu_cblist_invoking = false;
> >  	more = rcu_segcblist_ready_cbs(&sdp->srcu_cblist);
> >  	spin_unlock_irq_rcu_node(sdp);
>
> And if this works, can we then remove srcu_invoke_callbacks() self-requeue?
> If queued several times before it actually fires, it will catch the latest
> grace period's end. And if queued while the callback runs, it will re-run.
>

This makes sense, but I am not sure whether the non-wq context that
link [1] mentions needs it.

> Also why do we have sdp->srcu_invoke_callbacks ? Is that workqueue re-entrant?
>

I think you mean sdp->srcu_cblist_invoking? There was a prior discussion
on this [1], where Paul mentions the non-wq context.

Thanks
Neeraj

[1] https://lkml.org/lkml/2020/11/19/1065

> Thanks.
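
For illustration, the advance/accelerate window discussed above can be reproduced with a small userspace model. This is only a sketch under stated assumptions: the toy_* names and the count-based list are hypothetical simplifications (the kernel's rcu_segcblist uses tail pointers), the snapshot arithmetic assumes two low state bits in the sequence number, and sequence wraparound is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Segment indices, following the DONE/WAIT/NEXT_READY/NEXT naming in the thread. */
enum { TOY_DONE, TOY_WAIT, TOY_NEXT_READY, TOY_NEXT, TOY_NSEGS };

/* Hypothetical model: per-segment callback counts instead of tail pointers. */
struct toy_cblist {
	int len[TOY_NSEGS];
	unsigned long gp_seq[TOY_NSEGS];
};

/* Assumed encoding: two low state bits, so a snapshot rounds up past one full GP. */
static unsigned long toy_seq_snap(unsigned long seq)
{
	return (seq + 7UL) & ~3UL;
}

/* Move segments whose grace period has completed to DONE, then compact. */
static void toy_advance(struct toy_cblist *l, unsigned long seq)
{
	int i, j;

	for (i = TOY_WAIT; i < TOY_NEXT; i++) {
		if (seq < l->gp_seq[i])
			break;
		l->len[TOY_DONE] += l->len[i];
		l->len[i] = 0;
	}
	if (i == TOY_WAIT)
		return;	/* Nothing advanced, so no slot was freed. */
	for (j = TOY_WAIT; i < TOY_NEXT; i++, j++) {
		l->len[j] = l->len[i];
		l->gp_seq[j] = l->gp_seq[i];
		l->len[i] = 0;
	}
}

/* Try to assign grace period 'seq' to callbacks that do not have one yet. */
static bool toy_accelerate(struct toy_cblist *l, unsigned long seq)
{
	int i, j, rest = 0;

	/* Find the last non-empty segment waiting for an older grace period. */
	for (i = TOY_NEXT_READY; i > TOY_DONE; i--)
		if (l->len[i] != 0 && l->gp_seq[i] < seq)
			break;
	for (j = i + 1; j < TOY_NSEGS; j++)
		rest += l->len[j];
	/* Fail if nothing follows, or if no segment is left to hold 'seq'. */
	if (rest == 0 || ++i >= TOY_NEXT)
		return false;
	for (j = i + 1; j < TOY_NSEGS; j++) {
		l->len[i] += l->len[j];
		l->len[j] = 0;
	}
	for (; i < TOY_NEXT; i++)
		l->gp_seq[i] = seq;
	return true;
}

/*
 * Enqueue one callback, sampling the GP sequence twice as __call_srcu()
 * does: once for the advance and once for the snapshot.
 */
static bool toy_call_srcu(struct toy_cblist *l, unsigned long seq_at_advance,
			  unsigned long seq_at_snap)
{
	assert(l != NULL);
	l->len[TOY_NEXT]++;
	toy_advance(l, seq_at_advance);
	return toy_accelerate(l, toy_seq_snap(seq_at_snap));
}
```

Starting from a list with one callback in WAIT for gp_seq 8 and one in NEXT_READY for gp_seq 12, toy_call_srcu(&l, 5, 5) returns true and leaves NEXT empty. With toy_call_srcu(&l, 5, 9), where a grace period ends and a new one starts between the two samples, the snapshot becomes 16, acceleration returns false, and the new callback stays in NEXT without a grace period number.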