On 9/20/22 05:23, Frederic Weisbecker wrote:
> On Tue, Sep 20, 2022 at 11:45:45AM +0800, Pingfan Liu wrote:
>> On Mon, Sep 19, 2022 at 12:51:49PM +0200, Frederic Weisbecker wrote:
>>> On Mon, Sep 19, 2022 at 06:24:21PM +0800, Pingfan Liu wrote:
>>>> On Fri, Sep 16, 2022 at 10:52 PM Frederic Weisbecker
>>>> <frederic@xxxxxxxxxx> wrote:
>>>>>> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
>>>>>> index 438ecae6bd7e..ef6d3ae239b9 100644
>>>>>> --- a/kernel/rcu/tree_plugin.h
>>>>>> +++ b/kernel/rcu/tree_plugin.h
>>>>>> @@ -1224,7 +1224,7 @@ static void rcu_spawn_one_boost_kthread(struct rcu_node *rnp)
>>>>>>  static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu)
>>>>>>  {
>>>>>>  	struct task_struct *t = rnp->boost_kthread_task;
>>>>>> -	unsigned long mask = rcu_rnp_online_cpus(rnp);
>>>>>> +	unsigned long mask;
>>>>>>  	cpumask_var_t cm;
>>>>>>  	int cpu;
>>>>>>
>>>>>> @@ -1233,6 +1233,11 @@ static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu)
>>>>>>  	if (!zalloc_cpumask_var(&cm, GFP_KERNEL))
>>>>>>  		return;
>>>>>>  	mutex_lock(&rnp->boost_kthread_mutex);
>>>>>> +	/*
>>>>>> +	 * Relying on the lock to serialize, so when onlining, the latest
>>>>>> +	 * qsmaskinitnext is for cpus_ptr.
>>>>>> +	 */
>>>>>> +	mask = rcu_rnp_online_cpus(rnp);
>>>>>>  	for_each_leaf_node_possible_cpu(rnp, cpu)
>>>>>>  		if ((mask & leaf_node_cpu_bit(rnp, cpu)) &&
>>>>>>  		    cpu != outgoingcpu)
>>>>>
>>>>> Right but you still race against concurrent rcu_report_dead() doing:
>>>>>
>>>>>      WRITE_ONCE(rnp->qsmaskinitnext, rnp->qsmaskinitnext & ~mask)
>>>>>
>>>>
>>>> Ah. Indeed, my commit log does not describe this precisely.
>>>>
>>>> In fact, both speedup smp_init [1] and fast kexec reboot [2] still
>>>> use the model of one hotplug initiator and several reactors. The
>>>> initiators are still excluded from each other by the pair
>>>> cpu_maps_update_begin()/cpu_maps_update_done(), so there cannot be a
>>>> cpu-up and a cpu-down event at the same time.
>>>
>>> Yes but two downing CPUs may race, right?
>>>
>>> So isn't it possible to have rcu_report_dead() racing against
>>> rcutree_offline_cpu()?
>>>
>>
>> Yes, two downing cpus can be in the same batch.
>>
>> There is a nuance between onlining and offlining:
>> -1. onlining relies on qsmaskinitnext, which is stable at the point of
>>     rcutree_online_cpu(). That is what this patch aims for.
>> -2. offlining relies on cpu_dying_mask [2/3], instead of qsmaskinitnext,
>>     which is not cleared yet at the point of rcutree_offline_cpu().
>>
>> It is asymmetric owing to the point at which qsmaskinitnext is updated.
>>
>> Now considering the racing between rcu_report_dead() and
>> rcutree_offline_cpu():
>>
>> No matter whether rcutree_offline_cpu() reads out a stale or a fresh value
>> of qsmaskinitnext, the result is intersected with cpu_dying_mask.
>
> Ah I see now, so the race can happen but it is then cancelled by the
> stable READ on cpu_dying_mask. Ok.
>

Maybe I am missing something, but concurrent rcu_report_dead() calls on CPUs
of the same node can corrupt ->qsmaskinitnext as Frederic showed. That may
not be a problem for affinities (due to cpu_dying_mask), but is it not an
issue for RCU operation itself? I.e., due to the corruption, ->qsmaskinitnext
is left in an inconsistent state, causing *RCU* issues.

Or is there a reason that this is not an issue?

thanks,

- Joel
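
As a side note, here is a minimal standalone sketch of the lost-update
pattern in question. This is plain C with pthreads, not kernel code, and all
names are made up for illustration; it only shows why a non-atomic
read-modify-write such as
WRITE_ONCE(rnp->qsmaskinitnext, rnp->qsmaskinitnext & ~mask) can drop one of
two concurrent bit-clears when the writers do not share a lock. Whether tree
RCU is actually exposed to this depends on the locking around
rcu_report_dead(), which is exactly the question above.

/*
 * Standalone illustration (not kernel code): two threads each try to
 * clear their own bit in a shared mask using a plain read-modify-write
 * with no common lock.  One of the two clears can be lost.
 */
#include <pthread.h>
#include <stdio.h>

static unsigned long qsmask = 0x3;	/* bits for two hypothetical "CPUs" */
static pthread_barrier_t barrier;

static void *clear_bit_racy(void *arg)
{
	unsigned long my_bit = (unsigned long)arg;
	unsigned long old;

	/* Line both threads up so the race window is easy to hit. */
	pthread_barrier_wait(&barrier);

	old = qsmask;			/* read ...                          */
	/* ... the other thread may read the same old value here ...        */
	qsmask = old & ~my_bit;		/* ... modify, write: not atomic     */
	return NULL;
}

int main(void)
{
	pthread_t t1, t2;

	pthread_barrier_init(&barrier, NULL, 2);
	pthread_create(&t1, NULL, clear_bit_racy, (void *)0x1UL);
	pthread_create(&t2, NULL, clear_bit_racy, (void *)0x2UL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);

	/* Expected 0x0; may print 0x1 or 0x2 when an update is lost. */
	printf("qsmask = %#lx\n", qsmask);
	return 0;
}

Holding a common lock around the read-modify-write (or using an atomic
and-not) removes the window, which is why the serialization point matters
for the discussion above.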