On Mon, Sep 05, 2022 at 11:38:52AM +0800, Pingfan Liu wrote:
> rcutree_online_cpu() is concurrent, so there is the following race
> scene:
>
>            CPU 1                              CPU2
>     mask_old = rcu_rnp_online_cpus(rnp);
>     ...
>                                        mask_new = rcu_rnp_online_cpus(rnp);
>                                        ...
>                                        set_cpus_allowed_ptr(t, cm);
>
>     set_cpus_allowed_ptr(t, cm);
>
> Consequently, the old mask will overwrite the new mask in the task's
> cpus_ptr.
>
> Since there is a mutex ->boost_kthread_mutex, using it to build an
> order, then the latest ->qsmaskinitnext will be fetched for updating
> cpus_ptr.
>
> Notes about the cpu teardown: The cpu hot-removing initiator executes
> rcutree_dead_cpu() for the teardown cpu. If in future, an initiator can
> hot-remove several cpus at a time, it executes rcutree_dead_cpu() in
> series for each cpu. So it is still race-free without the mutex. But
> considering this is a cold path, applying that redundant mutex is
> harmless.
>
> Signed-off-by: Pingfan Liu <kernelfans@xxxxxxxxx>
> Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
> Cc: David Woodhouse <dwmw@xxxxxxxxxxxx>
> Cc: Frederic Weisbecker <frederic@xxxxxxxxxx>
> Cc: Neeraj Upadhyay <quic_neeraju@xxxxxxxxxxx>
> Cc: Josh Triplett <josh@xxxxxxxxxxxxxxxx>
> Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
> Cc: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
> Cc: Lai Jiangshan <jiangshanlai@xxxxxxxxx>
> Cc: Joel Fernandes <joel@xxxxxxxxxxxxxxxxx>
> Cc: "Jason A. Donenfeld" <Jason@xxxxxxxxx>
> To: rcu@xxxxxxxxxxxxxxx

Good choice, thank you!  Because I did not take your patch #2, I had to
hand-apply this one.  Please check the result below and let me know if
I messed something up.

							Thanx, Paul

------------------------------------------------------------------------

commit 7139c19d2ea117976aa892de4fb75682e989ba12
Author: Pingfan Liu <kernelfans@xxxxxxxxx>
Date:   Tue Sep 6 11:36:42 2022 -0700

    rcu: Synchronize ->qsmaskinitnext in rcu_boost_kthread_setaffinity()

    Once either rcutree_online_cpu() or rcutree_dead_cpu() is invoked
    concurrently, the following rcu_boost_kthread_setaffinity() race can
    occur:

            CPU 1                               CPU2
    mask = rcu_rnp_online_cpus(rnp);
    ...
                                        mask = rcu_rnp_online_cpus(rnp);
                                        ...
                                        set_cpus_allowed_ptr(t, cm);
    set_cpus_allowed_ptr(t, cm);

    This results in CPU2's update being overwritten by that of CPU1, and
    thus the possibility of ->boost_kthread_task continuing to run on a
    to-be-offlined CPU.

    This commit therefore eliminates this race by relying on the
    pre-existing acquisition of ->boost_kthread_mutex to serialize the
    full process of changing the affinity of ->boost_kthread_task.

    Signed-off-by: Pingfan Liu <kernelfans@xxxxxxxxx>
    Cc: David Woodhouse <dwmw@xxxxxxxxxxxx>
    Cc: Frederic Weisbecker <frederic@xxxxxxxxxx>
    Cc: Neeraj Upadhyay <quic_neeraju@xxxxxxxxxxx>
    Cc: Josh Triplett <josh@xxxxxxxxxxxxxxxx>
    Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
    Cc: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
    Cc: Lai Jiangshan <jiangshanlai@xxxxxxxxx>
    Cc: Joel Fernandes <joel@xxxxxxxxxxxxxxxxx>
    Cc: "Jason A. Donenfeld" <Jason@xxxxxxxxx>
    Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index e3142ee35fc6a..7b0fe741a0886 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1221,11 +1221,13 @@ static void rcu_spawn_one_boost_kthread(struct rcu_node *rnp)
  * We don't include outgoingcpu in the affinity set, use -1 if there is
  * no outgoing CPU.  If there are no CPUs left in the affinity set,
  * this function allows the kthread to execute on any CPU.
+ *
+ * Any future concurrent calls are serialized via ->boost_kthread_mutex.
  */
 static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu)
 {
 	struct task_struct *t = rnp->boost_kthread_task;
-	unsigned long mask = rcu_rnp_online_cpus(rnp);
+	unsigned long mask;
 	cpumask_var_t cm;
 	int cpu;
 
@@ -1234,6 +1236,7 @@ static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu)
 	if (!zalloc_cpumask_var(&cm, GFP_KERNEL))
 		return;
 	mutex_lock(&rnp->boost_kthread_mutex);
+	mask = rcu_rnp_online_cpus(rnp);
 	for_each_leaf_node_possible_cpu(rnp, cpu)
 		if ((mask & leaf_node_cpu_bit(rnp, cpu)) &&
 		    cpu != outgoingcpu)
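To make the ordering concrete, here is a stand-alone user-space sketch of
the same pattern: the shared mask is snapshotted only inside the mutex that
also covers the final store, so whichever caller applies its mask last has
necessarily read the freshest value.  Everything below (the pthread mutex,
online_mask, applied_mask, update_affinity(), offline_cpu2()) is made up
purely for illustration and is not kernel code.

/*
 * Illustration only: a user-space analogue of the serialization above.
 * Build with: cc -pthread sketch.c
 */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t boost_kthread_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned long online_mask = 0x7;	/* pretend CPUs 0-2 are online */
static unsigned long applied_mask;	/* stands in for the kthread's cpus_ptr */

/* Analogue of the fixed rcu_boost_kthread_setaffinity(): the mask is
 * sampled inside the critical section, not before it. */
static void update_affinity(int outgoingcpu)
{
	unsigned long mask, cm;

	pthread_mutex_lock(&boost_kthread_mutex);
	mask = online_mask;			/* like rcu_rnp_online_cpus(rnp) */
	cm = mask;
	if (outgoingcpu >= 0)
		cm &= ~(1UL << outgoingcpu);	/* exclude the outgoing CPU */
	applied_mask = cm;			/* like set_cpus_allowed_ptr(t, cm) */
	pthread_mutex_unlock(&boost_kthread_mutex);
}

/* Stands in for the CPU-hotplug path taking CPU 2 offline. */
static void *offline_cpu2(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&boost_kthread_mutex);
	online_mask &= ~(1UL << 2);
	pthread_mutex_unlock(&boost_kthread_mutex);
	update_affinity(2);
	return NULL;
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, offline_cpu2, NULL);
	update_affinity(-1);	/* concurrent caller with no outgoing CPU */
	pthread_join(t, NULL);
	/* Once both calls have finished, the last store saw the current mask,
	 * so CPU 2 cannot linger in applied_mask the way it could when the
	 * mask was sampled before the mutex was acquired. */
	printf("applied mask: %#lx\n", applied_mask);
	return 0;
}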