On Mon, Sep 19, 2022 at 12:33:23PM +0800, Pingfan Liu wrote:
> On Fri, Sep 16, 2022 at 10:24 PM Frederic Weisbecker
> > > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > > index ef6d3ae239b9..e5afc63bd97f 100644
> > > --- a/kernel/rcu/tree_plugin.h
> > > +++ b/kernel/rcu/tree_plugin.h
> > > @@ -1243,6 +1243,12 @@ static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu)
> > >  		    cpu != outgoingcpu)
> > >  			cpumask_set_cpu(cpu, cm);
> > >  	cpumask_and(cm, cm, housekeeping_cpumask(HK_TYPE_RCU));
> > > +	/*
> > > +	 * For concurrent offlining, the bit in qsmaskinitnext is not
> > > +	 * cleared yet, so resort to cpu_dying_mask, whose changes are
> > > +	 * already visible.
> > > +	 */
> > > +	if (outgoingcpu != -1)
> > > +		cpumask_andnot(cm, cm, cpu_dying_mask);
> >
> > I'm not sure how the infrastructure changes in your concurrent down
> > patchset, but can cpu_dying_mask concurrently change at this stage?
> >
>
> For the concurrent down patchset [1], it extends the cpu_down()
> capability to let an initiator tear down several CPUs in a batch
> and in parallel.
>
> As the first step, all CPUs to be torn down go through
> cpuhp_set_state(cpu, st, CPUHP_TEARDOWN_CPU); that way, they are
> set in the bitmap cpu_dying_mask [2]. Then the CPU hotplug kthread on
> each teardown CPU can be kicked to work. (Indeed, [2] has a bug, and I
> need to fix it by using a second loop to call
> cpuhp_kick_ap_work_async(cpu).)

So if I understand correctly, there is a synchronization point for all
CPUs between cpuhp_set_state() and CPUHP_AP_RCUTREE_ONLINE? And what about
rollbacks through cpuhp_reset_state()?

Thanks.
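
For concreteness, the two-phase sequence Pingfan describes might look
roughly like the sketch below. This is not the patchset's actual code:
cpus_down_batch() is an invented name, the sketch assumes it lives in
kernel/cpu.c next to cpuhp_set_state() and the cpuhp_state per-CPU data,
and cpuhp_kick_ap_work_async() is assumed to take just the CPU number, as
in the mail above.

/*
 * Rough sketch only -- models the ordering described above: publish
 * every outgoing CPU in cpu_dying_mask first, then kick the per-CPU
 * hotplug threads in a second loop.
 */
static int cpus_down_batch(const struct cpumask *cpus)
{
	int cpu;

	/*
	 * Step 1: mark each outgoing CPU.  cpuhp_set_state() sets the
	 * CPU's bit in cpu_dying_mask as a side effect, so after this
	 * loop completes, all the bits are visible to readers of the
	 * mask (e.g. rcu_boost_kthread_setaffinity() above).
	 */
	for_each_cpu(cpu, cpus) {
		struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);

		cpuhp_set_state(cpu, st, CPUHP_TEARDOWN_CPU);
	}

	/*
	 * Step 2 (the separate loop mentioned in the mail): only now
	 * kick the hotplug kthreads so the teardowns run in parallel.
	 */
	for_each_cpu(cpu, cpus)
		cpuhp_kick_ap_work_async(cpu);

	return 0;
}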