On Sat, Oct 1, 2022 at 10:26 AM Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
>
>
>
> On 9/20/22 05:23, Frederic Weisbecker wrote:
> > On Tue, Sep 20, 2022 at 11:45:45AM +0800, Pingfan Liu wrote:
> >> On Mon, Sep 19, 2022 at 12:51:49PM +0200, Frederic Weisbecker wrote:
> >>> On Mon, Sep 19, 2022 at 06:24:21PM +0800, Pingfan Liu wrote:
> >>>> On Fri, Sep 16, 2022 at 10:52 PM Frederic Weisbecker
> >>>> <frederic@xxxxxxxxxx> wrote:
> >>>>>> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> >>>>>> index 438ecae6bd7e..ef6d3ae239b9 100644
> >>>>>> --- a/kernel/rcu/tree_plugin.h
> >>>>>> +++ b/kernel/rcu/tree_plugin.h
> >>>>>> @@ -1224,7 +1224,7 @@ static void rcu_spawn_one_boost_kthread(struct rcu_node *rnp)
> >>>>>>  static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu)
> >>>>>>  {
> >>>>>>  	struct task_struct *t = rnp->boost_kthread_task;
> >>>>>> -	unsigned long mask = rcu_rnp_online_cpus(rnp);
> >>>>>> +	unsigned long mask;
> >>>>>>  	cpumask_var_t cm;
> >>>>>>  	int cpu;
> >>>>>>
> >>>>>> @@ -1233,6 +1233,11 @@ static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu)
> >>>>>>  	if (!zalloc_cpumask_var(&cm, GFP_KERNEL))
> >>>>>>  		return;
> >>>>>>  	mutex_lock(&rnp->boost_kthread_mutex);
> >>>>>> +	/*
> >>>>>> +	 * Relying on the lock to serialize, so when onlining, the latest
> >>>>>> +	 * qsmaskinitnext is for cpus_ptr.
> >>>>>> +	 */
> >>>>>> +	mask = rcu_rnp_online_cpus(rnp);
> >>>>>>  	for_each_leaf_node_possible_cpu(rnp, cpu)
> >>>>>>  		if ((mask & leaf_node_cpu_bit(rnp, cpu)) &&
> >>>>>>  		    cpu != outgoingcpu)
> >>>>>
> >>>>> Right but you still race against concurrent rcu_report_dead() doing:
> >>>>>
> >>>>>         WRITE_ONCE(rnp->qsmaskinitnext, rnp->qsmaskinitnext & ~mask)
> >>>>>
> >>>>
> >>>> Ah. Indeed, my commit log does not describe this precisely.
> >>>>
> >>>> In fact, either speedup smp_init [1] or fast kexec reboot [2] still
> >>>> uses the model: one hotplug initiator and several reactors. The
> >>>> initiators are still excluded from each other by the pair
> >>>> cpu_maps_update_begin()/cpu_maps_update_done(). So there cannot be a
> >>>> cpu-up and a cpu-down event at the same time.
> >>>
> >>> Yes but two downing CPUs may race, right?
> >>>
> >>> So isn't it possible to have rcu_report_dead() racing against
> >>> rcutree_offline_cpu()?
> >>>
> >>
> >> Yes, two downing cpus can be in a batch.
> >>
> >> There is a nuance between onlining and offlining:
> >> -1. onlining relies on qsmaskinitnext, which is stable at the point of
> >>     rcutree_online_cpu(). That is what this patch aims for.
> >> -2. offlining relies on cpu_dying_mask [2/3] instead of qsmaskinitnext,
> >>     which is not cleared yet at the point of rcutree_offline_cpu().
> >>
> >> It is asymmetric owing to the point at which qsmaskinitnext is updated.
> >>
> >> Now considering the race between rcu_report_dead() and
> >> rcutree_offline_cpu():
> >>
> >> No matter whether rcutree_offline_cpu() reads out a stale or a fresh
> >> value of qsmaskinitnext, it intersects the result with cpu_dying_mask.
> >
> > Ah I see now, so the race can happen but it is then cancelled by the
> > stable READ on cpu_dying_mask. Ok.
> >
>
> Maybe I am missing something but, concurrent rcu_report_dead() on CPUs
> of the same node can corrupt ->qsmaskinitnext as Frederic showed. That

There is a pair of lock ops around WRITE_ONCE(rnp->qsmaskinitnext,
rnp->qsmaskinitnext & ~mask), i.e. raw_spin_lock_irqsave_rcu_node(rnp,
flags) and raw_spin_unlock_irqrestore_rcu_node(rnp, flags). So it
should not be a problem, right?
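To spell out why I think the ->qsmaskinitnext update itself is safe,
here is a simplified sketch of the update path in rcu_report_dead()
(from my reading of the tree; the surrounding steps such as the
quiescent-state reporting are elided, so please double-check against
the actual source):

	/* Remove outgoing CPU from ->qsmaskinitnext in the leaf node. */
	mask = rdp->grpmask;
	raw_spin_lock_irqsave_rcu_node(rnp, flags);
	/* ... report any pending quiescent state, etc. ... */
	WRITE_ONCE(rnp->qsmaskinitnext, rnp->qsmaskinitnext & ~mask);
	raw_spin_unlock_irqrestore_rcu_node(rnp, flags);

Since each clearing of a ->qsmaskinitnext bit is a read-modify-write
done under the same rnp->lock, two CPUs going down under one leaf node
serialize against each other and cannot corrupt the mask. The only
remaining question is whether rcutree_offline_cpu() can observe a stale
snapshot, and that is what the intersection with cpu_dying_mask in
[2/3] is meant to cover.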
Thanks,

	Pingfan