On Thu, Dec 01, 2022 at 07:45:33AM +0800, Zqiang wrote:
> Currently, invoking rcu_tasks_rude_wait_gp() to wait for one rude
> RCU-Tasks grace period returns immediately if __num_online_cpus == 1,
> which is taken to indicate the end of the rude RCU-Tasks grace period.
> Suppose the system has two CPUs and consider the following scenario:
>
>         CPU0                                   CPU1 (going offline)
>                                           migration/1 task:
>                                             cpu_stopper_thread
>                                              -> take_cpu_down
>                                                 -> _cpu_disable
>                                                  (dec __num_online_cpus)
>                                                 -> cpuhp_invoke_callback
>                                                       preempt_disable
>                                                       access old_data0
>         task1
>         del old_data0                                 .....
>         synchronize_rcu_tasks_rude()
>         task1 schedule out
>         ....
>         task2 schedule in
>         rcu_tasks_rude_wait_gp()
>             -> __num_online_cpus == 1
>                -> return
>         ....
>         task1 schedule in
>         -> free old_data0
>                                                       preempt_enable
>
> When CPU1 decrements __num_online_cpus down to one, CPU1 has not yet
> finished going offline: the stop_machine task (migration/1) is still
> running on CPU1 and may still be accessing 'old_data0', but 'old_data0'
> has already been freed on CPU0.
>
> In order to prevent the above scenario from happening, this commit
> removes the check for __num_online_cpus == 1 and adds handling for
> calling synchronize_rcu_tasks_generic() during early boot (when the
> rcu_scheduler_active variable is RCU_SCHEDULER_INACTIVE, the scheduler
> is not yet initialized and only the boot CPU is online).
>
> Signed-off-by: Zqiang <qiang1.zhang@xxxxxxxxx>

>Very good, thank you!  I did the usual wordsmithing, including to that
>error message, so as usual please check to make sure that I didn't mess
>something up.
>
>							Thanx, Paul
>
>------------------------------------------------------------------------
>
>commit 033ddc5d337984e20b9d49c8af4faa4689727626
>Author: Zqiang <qiang1.zhang@xxxxxxxxx>
>Date:   Thu Dec 1 07:45:33 2022 +0800
>
>    rcu-tasks: Make rude RCU-Tasks work well with CPU hotplug
>
>    The synchronize_rcu_tasks_rude() function invokes rcu_tasks_rude_wait_gp()
>    to wait one rude RCU-tasks grace period.  The rcu_tasks_rude_wait_gp()
>    function in turn checks if there is only a single online CPU.  If so, it
>    will immediately return, because a call to synchronize_rcu_tasks_rude()
>    is by definition a grace period on a single-CPU system.  (We could
>    have blocked!)
>
>    Unfortunately, this check uses num_online_cpus() without synchronization,
>    which can result in too-short grace periods.  To see this, consider the
>    following scenario:
>
>            CPU0                                   CPU1 (going offline)
>                                              migration/1 task:
>                                                cpu_stopper_thread
>                                                 -> take_cpu_down
>                                                    -> _cpu_disable
>                                                     (dec __num_online_cpus)
>                                                    -> cpuhp_invoke_callback
>                                                          preempt_disable
>                                                          access old_data0
>            task1
>            del old_data0                                    .....
>            synchronize_rcu_tasks_rude()
>            task1 schedule out
>            ....
>            task2 schedule in
>            rcu_tasks_rude_wait_gp()
>                -> __num_online_cpus == 1
>                   -> return
>            ....
>            task1 schedule in
>            -> free old_data0
>                                                          preempt_enable
>
>    When CPU1 decrements __num_online_cpus, its value becomes 1.  However,
>    CPU1 has not finished going offline, and will take one last trip through
>    the scheduler and the idle loop before it actually stops executing
>    instructions.  Because synchronize_rcu_tasks_rude() is mostly used for
>    tracing, and because both the scheduler and the idle loop can be traced,
>    this means that CPU0's prematurely ended grace period might disrupt the
>    tracing on CPU1.  Given that this disruption might include CPU1 executing
>    instructions in memory that was just now freed (and maybe reallocated),
>    this is a matter of some concern.
>
>    This commit therefore removes that problematic single-CPU check from the
>    rcu_tasks_rude_wait_gp() function.
>    This dispenses with the single-CPU optimization, but there is no
>    evidence indicating that this optimization is important.  In addition,
>    synchronize_rcu_tasks_generic() contains a similar optimization (albeit
>    only for early boot), which also splats.  (As in exactly why are you
>    invoking synchronize_rcu_tasks_rude() so early in boot, anyway???)
>
>    It is OK for the synchronize_rcu_tasks_rude() function's check to be
>    unsynchronized because the only times that this check can evaluate to
>    true is when there is only a single CPU running with preemption
>    disabled.
>
>    While in the area, this commit also fixes a minor bug in which a
>    call to synchronize_rcu_tasks_rude() would instead be attributed to
>    synchronize_rcu_tasks().
>
>    [ paulmck: Add "synchronize_" prefix and "()" suffix. ]
>
>    Signed-off-by: Zqiang <qiang1.zhang@xxxxxxxxx>
>    Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
>
>diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
>index 4dda8e6e5707f..d845723c1af41 100644
>--- a/kernel/rcu/tasks.h
>+++ b/kernel/rcu/tasks.h
>@@ -560,8 +560,9 @@ static int __noreturn rcu_tasks_kthread(void *arg)
> static void synchronize_rcu_tasks_generic(struct rcu_tasks *rtp)
> {
> 	/* Complain if the scheduler has not started.  */
>-	WARN_ONCE(rcu_scheduler_active == RCU_SCHEDULER_INACTIVE,
>-		  "synchronize_rcu_tasks called too soon");
>+	if (WARN_ONCE(rcu_scheduler_active == RCU_SCHEDULER_INACTIVE,
>+		      "synchronize_%s() called too soon", rtp->name))
>+		return;

Thanks Paul, detailed description and modification 😊.

>
> 	// If the grace-period kthread is running, use it.
> 	if (READ_ONCE(rtp->kthread_ptr)) {
>@@ -1064,9 +1065,6 @@ static void rcu_tasks_be_rude(struct work_struct *work)
> // Wait for one rude RCU-tasks grace period.
> static void rcu_tasks_rude_wait_gp(struct rcu_tasks *rtp)
> {
>-	if (num_online_cpus() <= 1)
>-		return;	// Fastpath for only one CPU.
>-
> 	rtp->n_ipis += cpumask_weight(cpu_online_mask);
> 	schedule_on_each_cpu(rcu_tasks_be_rude);
> }
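
For readers following along, the failure mode is easiest to see from the
updater side.  Below is a minimal, hypothetical sketch (not taken from this
patch; struct, variable, and function names are made up for illustration) of
the usual remove-then-wait-then-free pattern that relies on
synchronize_rcu_tasks_rude(), mirroring the "del old_data0 ... free
old_data0" steps in the scenario diagram.  It assumes a kernel built with
CONFIG_TASKS_RUDE_RCU.

/*
 * Illustrative sketch only -- not part of this patch.  The names
 * demo_data, old_data0, and demo_retire_old_data() are hypothetical.
 */
#include <linux/rcupdate.h>
#include <linux/slab.h>

struct demo_data {
	int val;
};

/* Readers access this pointer from preempt-disabled (e.g. traced) code. */
static struct demo_data __rcu *old_data0;

static void demo_retire_old_data(void)
{
	struct demo_data *p;

	/* "del old_data0": unpublish the item so new readers cannot find it. */
	p = rcu_replace_pointer(old_data0, NULL, true);
	if (!p)
		return;

	/*
	 * Rude RCU-Tasks grace period: schedule_on_each_cpu() forces a
	 * context switch on every online CPU, so any code running with
	 * preemption disabled (such as the cpuhp callback on the outgoing
	 * CPU in the scenario above) must have completed.  If this grace
	 * period ends too soon, the kfree() below frees memory that the
	 * outgoing CPU may still be accessing.
	 */
	synchronize_rcu_tasks_rude();

	/* "free old_data0": now safe, no CPU can still reference it. */
	kfree(p);
}

The outgoing CPU in the diagram is exactly the kind of preempt-disabled
"reader" that this pattern counts on the grace period to wait for, which is
why the unsynchronized num_online_cpus() fastpath could not be used safely.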