> On Dec 2, 2022, at 7:52 AM, Zhang, Qiang1 <qiang1.zhang@xxxxxxxxx> wrote: > > On Thu, Dec 01, 2022 at 07:45:33AM +0800, Zqiang wrote: >> Currently, invoke rcu_tasks_rude_wait_gp() to wait one rude >> RCU-tasks grace period, if __num_online_cpus == 1, will return >> directly, indicates the end of the rude RCU-task grace period. >> suppose the system has two cpus, consider the following scenario: >> >> CPU0 CPU1 (going offline) >> migration/1 task: >> cpu_stopper_thread >> -> take_cpu_down >> -> _cpu_disable >> (dec __num_online_cpus) >> ->cpuhp_invoke_callback >> preempt_disable >> access old_data0 >> task1 >> del old_data0 ..... >> synchronize_rcu_tasks_rude() >> task1 schedule out >> .... >> task2 schedule in >> rcu_tasks_rude_wait_gp() >> ->__num_online_cpus == 1 >> ->return >> .... >> task1 schedule in >> ->free old_data0 >> preempt_enable >> >> when CPU1 dec __num_online_cpus and __num_online_cpus is equal one, >> the CPU1 has not finished offline, stop_machine task(migration/1) >> still running on CPU1, maybe still accessing 'old_data0', but the >> 'old_data0' has freed on CPU0. >> >> In order to prevent the above scenario from happening, this commit >> remove check for __num_online_cpus == 0 and add handling of calling >> synchronize_rcu_tasks_generic() during early boot(when the >> rcu_scheduler_active variable is RCU_SCHEDULER_INACTIVE, the scheduler >> not yet initialized and only one boot-CPU online). >> >> Signed-off-by: Zqiang <qiang1.zhang@xxxxxxxxx> > >> Very good, thank you! I did the usual wordsmithing, including to that >> error message, so as usual please check to make sure that I didn't mess >> something up. >> >> Thanx, Paul >> >> ------------------------------------------------------------------------ >> >> commit 033ddc5d337984e20b9d49c8af4faa4689727626 >> Author: Zqiang <qiang1.zhang@xxxxxxxxx> >> Date: Thu Dec 1 07:45:33 2022 +0800 >> >> rcu-tasks: Make rude RCU-Tasks work well with CPU hotplug >> >> The synchronize_rcu_tasks_rude() function invokes rcu_tasks_rude_wait_gp() >> to wait one rude RCU-tasks grace period. The rcu_tasks_rude_wait_gp() >> function in turn checks if there is only a single online CPU. If so, it >> will immediately return, because a call to synchronize_rcu_tasks_rude() >> is by definition a grace period on a single-CPU system. (We could >> have blocked!) >> >> Unfortunately, this check uses num_online_cpus() without synchronization, >> which can result in too-short grace periods. To see this, consider the >> following scenario: >> >> CPU0 CPU1 (going offline) >> migration/1 task: >> cpu_stopper_thread >> -> take_cpu_down >> -> _cpu_disable >> (dec __num_online_cpus) >> ->cpuhp_invoke_callback >> preempt_disable >> access old_data0 >> task1 >> del old_data0 ..... >> synchronize_rcu_tasks_rude() >> task1 schedule out >> .... >> task2 schedule in >> rcu_tasks_rude_wait_gp() >> ->__num_online_cpus == 1 >> ->return >> .... >> task1 schedule in >> ->free old_data0 >> preempt_enable >> >> When CPU1 decrements __num_online_cpus, its value becomes 1. However, >> CPU1 has not finished going offline, and will take one last trip through >> the scheduler and the idle loop before it actually stops executing >> instructions. Because synchronize_rcu_tasks_rude() is mostly used for >> tracing, and because both the scheduler and the idle loop can be traced, >> this means that CPU0's prematurely ended grace period might disrupt the >> tracing on CPU1. Given that this disruption might include CPU1 executing >> instructions in memory that was just now freed (and maybe reallocated), >> this is a matter of some concern. >> >> This commit therefore removes that problematic single-CPU check from the >> rcu_tasks_rude_wait_gp() function. This dispenses with the single-CPU >> optimization, but there is no evidence indicating that this optimization >> is important. In addition, synchronize_rcu_tasks_generic() contains a >> similar optimization (albeit only for early boot), which also splats. >> (As in exactly why are you invoking synchronize_rcu_tasks_rude() so >> early in boot, anyway???) >> >> It is OK for the synchronize_rcu_tasks_rude() function's check to be >> unsynchronized because the only times that this check can evaluate to >> true is when there is only a single CPU running with preemption >> disabled. >> >> While in the area, this commit also fixes a minor bug in which a >> call to synchronize_rcu_tasks_rude() would instead be attributed to >> synchronize_rcu_tasks(). >> >> [ paulmck: Add "synchronize_" prefix and "()" suffix. ] >> >> Signed-off-by: Zqiang <qiang1.zhang@xxxxxxxxx> >> Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx> >> >> diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h >> index 4dda8e6e5707f..d845723c1af41 100644 >> --- a/kernel/rcu/tasks.h >> +++ b/kernel/rcu/tasks.h >> @@ -560,8 +560,9 @@ static int __noreturn rcu_tasks_kthread(void *arg) >> static void synchronize_rcu_tasks_generic(struct rcu_tasks *rtp) >> { >> /* Complain if the scheduler has not started. */ >> - WARN_ONCE(rcu_scheduler_active == RCU_SCHEDULER_INACTIVE, >> - "synchronize_rcu_tasks called too soon"); >> + if (WARN_ONCE(rcu_scheduler_active == RCU_SCHEDULER_INACTIVE, >> + "synchronize_%s() called too soon", rtp->name)) > > Thanks Paul, detailed description and modification 😊. True statement. Good example for everyone ;-) - Joel > > >> + return; >> >> // If the grace-period kthread is running, use it. >> if (READ_ONCE(rtp->kthread_ptr)) { >> @@ -1064,9 +1065,6 @@ static void rcu_tasks_be_rude(struct work_struct *work) >> // Wait for one rude RCU-tasks grace period. >> static void rcu_tasks_rude_wait_gp(struct rcu_tasks *rtp) >> { >> - if (num_online_cpus() <= 1) >> - return; // Fastpath for only one CPU. >> - >> rtp->n_ipis += cpumask_weight(cpu_online_mask); >> schedule_on_each_cpu(rcu_tasks_be_rude); >> }