RE: [PATCH] rcu/tree: Improve comments in rcu_report_qs_rdp()

"Zhang, Qiang1" <qiang1.zhang@xxxxxxxxx> · Sat, 11 Feb 2023 11:18:10 +0000

>
> On Sat, Feb 04, 2023 at 02:20:50AM +0000, Joel Fernandes (Google) wrote:
> > Recent discussion triggered due to a patch linked below, from Qiang,
> > shed light on the need to accelerate from QS reporting paths.
> >
> > Update the comments to capture this piece of knowledge.
> >
> > Link: https://lore.kernel.org/all/20230118073014.2020743-1-qiang1.zhang@xxxxxxxxx/
> > Cc: Qiang Zhang <Qiang1.zhang@xxxxxxxxx>
> > Cc: Frederic Weisbecker <frederic@xxxxxxxxxx>
> > Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
> >
> > ---
> >  kernel/rcu/tree.c | 13 ++++++++++++-
> >  1 file changed, 12 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 93eb03f8ed99..713eb6ca6902 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -1983,7 +1983,12 @@ rcu_report_qs_rdp(struct rcu_data *rdp)
> >       } else {
> >               /*
> >                * This GP can't end until cpu checks in, so all of our
> > -              * callbacks can be processed during the next GP.
> > +              * callbacks can be processed during the next GP. Do
> > +              * the acceleration from here otherwise there may be extra
> > +              * grace period delays, as any accelerations from rcu_core()
> > +              * or note_gp_changes() may happen only after the GP after the
> > +              * current one has already started. Further, rcu_core()
> > +              * only accelerates if RCU is idle (no GP in progress).
>
> Actually note_gp_changes() should take care of that.
>
>You are referring to  rcu_core() -> rcu_check_quiescent_state() ->
>note_gp_changes() doing the acceleration prior to the  rcu_core() ->
>rcu_report_qs_rdp() call, correct?
>
>Ah, but note_gp_changes() has an early return which triggers if either:
>1. The rnp spinlock trylock failed.
>2. The start of a new grace period was already detected before, so
>rdp->gp_seq == rnp->gp_seq.
>
>So I think it is possible that we are in the middle of a GP, and
>rcu_core() is called because QS reporting is required for the CPU, and
>say the current GP started we are in the middle off occurs from the
>same CPU so rdp->gp_seq == rnp->gp_seq.
>
>Now, rcu_core()'s call to note_gp_changes() should return early but
>its later call to report_qs_rdp() will not accelerate the callback
>without the code we are commenting here.
>
> My gut feeling is that the
> acceleration in rcu_report_qs_rdp() only stands for:
>
> * callbacks that may be enqueued from an IRQ firing during the small window
>   between the RNP unlock in note_gp_changes() and the RNP lock in
>   rcu_report_qs_rdp()

For rdp which is in the middle of a de-offloading process, the bypass list have been
flushed, the nocb kthreads may miss callbacks acceleration.   invoke call_rcu()
will also not use bypass list. if at this time a new gp starts, before call rcu_report_qs_rdp()
to report qs,  even if rcu_core() invoke note_gp_changes() notice gp start, this rdp's callback
may still miss acceleration if rdp still in de-offloading process, because invoke rcu_rdp_is_offloaded()
still return true.

I think this is also a reason.

Thanks
Zqiang

>
>Sure, this also seems like a valid reason.
>
> * __note_gp_changes() got called even before from the GP kthread, and callbacks
>   got enqueued between that and rcu_core().
>
>Agreed. In this case we will take the early return in
>note_gp_changes() when called from the rcu_core(). So yeah, that was
>kind of my point as well but slightly different reasoning.
>
>Let me know if you disagree with anything I mentioned, though.
>
> - Joel