Re: [PATCH RFC rcu] Fix get_state_synchronize_rcu_full() GP-start detection

Frederic Weisbecker <frederic@xxxxxxxxxx> · Fri, 24 Jan 2025 15:56:19 +0100

On Thu, Jan 23, 2025 at 08:49:47PM -0500, Joel Fernandes wrote:
> On Thu, Dec 12, 2024 at 7:59 PM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> >
> > The get_state_synchronize_rcu_full() and poll_state_synchronize_rcu_full()
> > functions use the root rcu_node structure's ->gp_seq field to detect
> > the beginnings and ends of grace periods, respectively.  This choice is
> > necessary for the poll_state_synchronize_rcu_full() function because
> > (give or take counter wrap), the following sequence is guaranteed not
> > to trigger:
> >
> >         get_state_synchronize_rcu_full(&rgos);
> >         synchronize_rcu();
> >         WARN_ON_ONCE(!poll_state_synchronize_rcu_full(&rgos));
> >
> > The RCU callbacks that awaken synchronize_rcu() instances are
> > guaranteed not to be invoked before the root rcu_node structure's
> > ->gp_seq field is updated to indicate the end of the grace period.
> > However, these callbacks might start being invoked immediately
> > thereafter, in particular, before rcu_state.gp_seq has been updated.
> > Therefore, poll_state_synchronize_rcu_full() must refer to the
> > root rcu_node structure's ->gp_seq field.  Because this field is
> > updated under this structure's ->lock, any code following a call to
> > poll_state_synchronize_rcu_full() will be fully ordered after the
> > full grace-period computation, as is required by RCU's memory-ordering
> > semantics.
> >
> > By symmetry, the get_state_synchronize_rcu_full() function should also
> > use this same root rcu_node structure's ->gp_seq field.  But it turns out
> > that symmetry is profoundly (though extremely infrequently) destructive
> > in this case.  To see this, consider the following sequence of events:
> >
> > 1.      CPU 0 starts a new grace period, and updates rcu_state.gp_seq
> >         accordingly.

I don't think so, because the grace period waits for idle CPUs to report a
quiescent state (QS), unlike offline CPUs, which don't appear in
->qsmaskinitnext.

If CPU 1 is idle while the grace-period kthread scans its
ct_rcu_watching_cpu(), then the QS is reported on its behalf, and when CPU 1
exits idle it is guaranteed to see the newly started GP on the root node.

If CPU 1 is not idle while the grace-period kthread scans its
ct_rcu_watching_cpu(), then CPU 1 must report a QS itself, and that cancels
the race.
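
To make the two cases above concrete, here is a rough userspace toy model.
It is not kernel code: struct toy_cpu, toy_gp_waits_on() and toy_scan() are
hypothetical names made up for illustration, and the model ignores locking,
memory ordering and the rcu_node tree entirely. It only encodes the claim
that offline CPUs are not waited upon, idle CPUs are waited upon, and the
QS reporter depends on whether the CPU is idle at scan time.

```c
#include <stdbool.h>

/* Hypothetical toy model for illustration only, not kernel code. */
enum toy_qs_reporter {
	TOY_QS_BY_GP_KTHREAD,	/* reported on behalf of an idle CPU */
	TOY_QS_BY_CPU_ITSELF,	/* a non-idle CPU must report its own QS */
};

struct toy_cpu {
	bool online;	/* assumption: stands in for presence in ->qsmaskinitnext */
	bool watching;	/* assumption: false models an idle CPU at scan time */
};

/* The grace period waits on online CPUs, idle or not; never on offline ones. */
static bool toy_gp_waits_on(const struct toy_cpu *cpu)
{
	return cpu->online;
}

/* Who reports the QS when the grace-period kthread scans this CPU? */
static enum toy_qs_reporter toy_scan(const struct toy_cpu *cpu)
{
	return cpu->watching ? TOY_QS_BY_CPU_ITSELF : TOY_QS_BY_GP_KTHREAD;
}
```

In this sketch an idle-but-online CPU still makes toy_gp_waits_on() return
true, which is the difference from the offline case that I am relying on.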

Thanks.