On Fri, Sep 16, 2022 at 10:52 PM Frederic Weisbecker <frederic@xxxxxxxxxx> wrote:
>
> On Thu, Sep 15, 2022 at 01:58:23PM +0800, Pingfan Liu wrote:
> > rcutree_online_cpu() is concurrent, so there is the following race
> > scenario:
> >
> >         CPU 1                              CPU 2
> >   mask_old = rcu_rnp_online_cpus(rnp);
> >   ...
> >                                      mask_new = rcu_rnp_online_cpus(rnp);
> >                                      ...
> >                                      set_cpus_allowed_ptr(t, cm);
> >   ...
> >   set_cpus_allowed_ptr(t, cm);
> >
> > Consequently, the old mask will overwrite the new mask in the task's cpus_ptr.
> >
> > Since there is a mutex ->boost_kthread_mutex, use it to enforce an
> > ordering, so that the latest ->qsmaskinitnext is fetched when updating cpus_ptr.
> >
> > But for concurrent offlining, ->qsmaskinitnext is not reliable in
> > rcutree_offline_cpu(). That is another story and is addressed in the
> > following patch.
> >
> > Signed-off-by: Pingfan Liu <kernelfans@xxxxxxxxx>
> > Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
> > Cc: David Woodhouse <dwmw@xxxxxxxxxxxx>
> > Cc: Frederic Weisbecker <frederic@xxxxxxxxxx>
> > Cc: Neeraj Upadhyay <quic_neeraju@xxxxxxxxxxx>
> > Cc: Josh Triplett <josh@xxxxxxxxxxxxxxxx>
> > Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
> > Cc: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
> > Cc: Lai Jiangshan <jiangshanlai@xxxxxxxxx>
> > Cc: Joel Fernandes <joel@xxxxxxxxxxxxxxxxx>
> > Cc: "Jason A. Donenfeld" <Jason@xxxxxxxxx>
> > To: rcu@xxxxxxxxxxxxxxx
> > ---
> >  kernel/rcu/tree_plugin.h | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > index 438ecae6bd7e..ef6d3ae239b9 100644
> > --- a/kernel/rcu/tree_plugin.h
> > +++ b/kernel/rcu/tree_plugin.h
> > @@ -1224,7 +1224,7 @@ static void rcu_spawn_one_boost_kthread(struct rcu_node *rnp)
> >  static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu)
> >  {
> >  	struct task_struct *t = rnp->boost_kthread_task;
> > -	unsigned long mask = rcu_rnp_online_cpus(rnp);
> > +	unsigned long mask;
> >  	cpumask_var_t cm;
> >  	int cpu;
> >
> > @@ -1233,6 +1233,11 @@ static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu)
> >  	if (!zalloc_cpumask_var(&cm, GFP_KERNEL))
> >  		return;
> >  	mutex_lock(&rnp->boost_kthread_mutex);
> > +	/*
> > +	 * Relying on the lock to serialize, so when onlining, the latest
> > +	 * qsmaskinitnext is for cpus_ptr.
> > +	 */
> > +	mask = rcu_rnp_online_cpus(rnp);
> >  	for_each_leaf_node_possible_cpu(rnp, cpu)
> >  		if ((mask & leaf_node_cpu_bit(rnp, cpu)) &&
> >  		    cpu != outgoingcpu)
>
> Right but you still race against concurrent rcu_report_dead() doing:
>
> 	WRITE_ONCE(rnp->qsmaskinitnext, rnp->qsmaskinitnext & ~mask)
>

Ah. Indeed, my commit log does not describe this precisely. In fact,
both speedup smp_init [1] and fast kexec reboot [2] still use the
model of one hotplug initiator and several reactors. The initiators
are still serialized against each other by the pair
cpu_maps_update_begin()/cpu_maps_update_done(). So there cannot be a
cpu-up event and a cpu-down event at the same time.

Thanks,

	Pingfan
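
For illustration, below is a minimal user-space sketch (plain C with
pthreads, not kernel code) of the serialization pattern the patch
relies on: the shared mask is snapshotted *inside* the mutex, so a
stale snapshot taken before the lock can never overwrite a newer one.
Names such as fake_qsmaskinitnext and fake_cpus_ptr are hypothetical
stand-ins for rnp->qsmaskinitnext and the kthread's cpus_ptr.

	#include <pthread.h>
	#include <stdio.h>

	static unsigned long fake_qsmaskinitnext;  /* models rnp->qsmaskinitnext */
	static unsigned long fake_cpus_ptr;        /* models the kthread's cpus_ptr */
	static pthread_mutex_t boost_kthread_mutex = PTHREAD_MUTEX_INITIALIZER;

	/*
	 * Models the fixed rcu_boost_kthread_setaffinity(): the mask is
	 * read under the lock, so each writer applies the latest value.
	 */
	static void set_affinity_serialized(int onlining_cpu)
	{
		unsigned long mask;

		pthread_mutex_lock(&boost_kthread_mutex);
		fake_qsmaskinitnext |= 1UL << onlining_cpu; /* CPU comes online */
		mask = fake_qsmaskinitnext;                 /* latest snapshot */
		fake_cpus_ptr = mask;           /* models set_cpus_allowed_ptr() */
		pthread_mutex_unlock(&boost_kthread_mutex);
	}

	static void *online_cpu_thread(void *arg)
	{
		set_affinity_serialized((int)(long)arg);
		return NULL;
	}

	int main(void)
	{
		pthread_t t1, t2;

		/* Two concurrent "rcutree_online_cpu()" reactors. */
		pthread_create(&t1, NULL, online_cpu_thread, (void *)1L);
		pthread_create(&t2, NULL, online_cpu_thread, (void *)2L);
		pthread_join(t1, NULL);
		pthread_join(t2, NULL);

		/* Both bits always end up set; a pre-lock snapshot could lose one. */
		printf("cpus_ptr = %#lx\n", fake_cpus_ptr);
		return 0;
	}

If the snapshot were taken before pthread_mutex_lock(), as in the
pre-patch code, one thread's older mask could overwrite the other's
newer one, which is exactly the race the commit log describes.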