On Mon, Sep 19, 2022 at 12:34:32PM +0200, Frederic Weisbecker wrote:
> On Mon, Sep 19, 2022 at 12:33:23PM +0800, Pingfan Liu wrote:
> > On Fri, Sep 16, 2022 at 10:24 PM Frederic Weisbecker
> > > > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > > > index ef6d3ae239b9..e5afc63bd97f 100644
> > > > --- a/kernel/rcu/tree_plugin.h
> > > > +++ b/kernel/rcu/tree_plugin.h
> > > > @@ -1243,6 +1243,12 @@ static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu)
> > > >  		    cpu != outgoingcpu)
> > > >  			cpumask_set_cpu(cpu, cm);
> > > >  	cpumask_and(cm, cm, housekeeping_cpumask(HK_TYPE_RCU));
> > > > +	/*
> > > > +	 * For concurrent offlining, the bit of qsmaskinitnext is not cleared yet.
> > > > +	 * So resort to cpu_dying_mask, whose changes are already visible.
> > > > +	 */
> > > > +	if (outgoingcpu != -1)
> > > > +		cpumask_andnot(cm, cm, cpu_dying_mask);
> > >
> > > I'm not sure how the infrastructure changes in your concurrent down patchset
> > > but can the cpu_dying_mask concurrently change at this stage?
> > >
> >
> > For the concurrent down patchset [1], it extends the cpu_down()
> > capability to let an initiator tear down several cpus in a batch
> > and in parallel.
> >
> > As the first step, all cpus to be torn down experience
> > cpuhp_set_state(cpu, st, CPUHP_TEARDOWN_CPU); that way, they are
> > set in the bitmap cpu_dying_mask [2]. Then the cpu hotplug kthread on
> > each teardown cpu can be kicked to work. (Indeed, [2] has a bug, and I
> > need to fix it by using another loop to call
> > cpuhp_kick_ap_work_async(cpu);)
>
> So if I understand correctly there is a synchronization point for all
> CPUs between cpuhp_set_state() and CPUHP_AP_RCUTREE_ONLINE ?
>

Yes, your understanding is right.

> And how about rollbacks through cpuhp_reset_state() ?
>

Originally, cpuhp_reset_state() was not considered in my fast kexec
reboot series, since at that point all devices have been shut down and
there is no way back; the reboot just ventures onward.

But yes, as you point out, cpuhp_reset_state() poses a challenge to
keeping cpu_dying_mask stable. Consider the following order:

1. set_cpu_dying(true)
   rcutree_offline_cpu()

2. on rollback
   set_cpu_dying(false)
   rcutree_online_cpu()

The dying mask is stable before the rcu routines run, and
rnp->boost_kthread_mutex can be used to build an ordering that makes
the latest cpu_dying_mask visible, as in [1/3].
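To make that ordering concrete, here is a minimal sketch (not the
actual [1/3] patch) of mainline rcu_boost_kthread_setaffinity() with
the proposed cpu_dying_mask filtering folded in. It assumes that
set_cpu_dying() has already flipped the bit by the time
rcutree_offline_cpu()/rcutree_online_cpu() run, and that those
callbacks serialize with this function via rnp->boost_kthread_mutex:

/*
 * Sketch only: current mainline body plus the proposed filtering.
 * Assumption: the hotplug path updates cpu_dying_mask strictly
 * before the RCU online/offline callbacks run, and those callbacks
 * take rnp->boost_kthread_mutex, so the reads below are ordered
 * against both set_cpu_dying(true) and the rollback to false.
 */
static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu)
{
	struct task_struct *t = rnp->boost_kthread_task;
	unsigned long mask;
	cpumask_var_t cm;
	int cpu;

	if (!t)
		return;
	if (!zalloc_cpumask_var(&cm, GFP_KERNEL))
		return;
	mutex_lock(&rnp->boost_kthread_mutex);
	mask = READ_ONCE(rnp->qsmaskinitnext);
	for_each_leaf_node_possible_cpu(rnp, cpu)
		if ((mask & leaf_node_cpu_bit(rnp, cpu)) &&
		    cpu != outgoingcpu)
			cpumask_set_cpu(cpu, cm);
	cpumask_and(cm, cm, housekeeping_cpumask(HK_TYPE_RCU));
	/*
	 * Under concurrent offlining, qsmaskinitnext may not yet be
	 * cleared for the other outgoing CPUs, but their bits in
	 * cpu_dying_mask were set before this point, so filter on
	 * that mask instead.
	 */
	if (outgoingcpu != -1)
		cpumask_andnot(cm, cm, cpu_dying_mask);
	if (cpumask_empty(cm)) {
		cpumask_copy(cm, housekeeping_cpumask(HK_TYPE_RCU));
		if (outgoingcpu >= 0)
			cpumask_clear_cpu(outgoingcpu, cm);
	}
	set_cpus_allowed_ptr(t, cm);
	mutex_unlock(&rnp->boost_kthread_mutex);
	free_cpumask_var(cm);
}

Thanks,

Pingfan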