Good diagram, thank you!  But that call to on_each_cpu() looks like this:

	on_each_cpu(rcu_barrier_func, NULL, 0, 1);

Doesn't that last parameter of "1" mean that on_each_cpu() cannot return
until after CPU 3's call to rcu_barrier_func() finishes?

							Thanx, Paul

On Fri, Oct 07, 2022 at 04:03:12PM +0800, 5486 wrote:
> -------- Original Message --------
> From: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
> Date: 2022/10/7 2:43
> To: "5486" <3164135486@xxxxxx>
> Cc: "rcu" <rcu@xxxxxxxxxxxxxxx>
> Subject: Re: Re: Re: Re:
>
> On Fri, Oct 07, 2022 at 12:55:49AM +0800, 5486 wrote:
> > > Note that the last argument to on_each_cpu() is "1", that is,
> > > on_each_cpu() does not return until rcu_barrier_func() has executed
> > > on all CPUs.
> > >
> > > Can your scenario happen?
> >
> > Execution of rcu_barrier_func() is not enough; the goal is to finish
> > all the RCU callback functions.  So it can.
>
> OK, then please show the sequence of calls to rcu_barrier_func() and
> rcu_barrier_callback() on their respective CPUs, along with their effects
> on rcu_barrier_cpu_count, in order to show exactly how this can happen.
>
> 							Thanx, Paul
>
> > -------- Original Message --------
> > From: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
> > Date: 2022/10/7 0:46
> > To: "5486" <3164135486@xxxxxx>
> > Cc: "rcu" <rcu@xxxxxxxxxxxxxxx>
> > Subject: Re: Re: Re:
> >
> > On Fri, Oct 07, 2022 at 12:38:56AM +0800, 5486 wrote:
> > > In fact, suppose there are four CPUs: cpu0 cpu1 cpu2 cpu3.
> > >
> > > The value of rcu_barrier_cpu_count may be:
> > >
> > >	1 (cpu0), 2 -> 1 (cpu1), 2 -> 1 (cpu2), 2 -> 1
> > >
> > > Then on_each_cpu() finished, rcu_barrier_cpu_count decreased to 0,
> > > then complete(&rcu_barrier_completion) ran and rcu_barrier()
> > > finished, but rcu_barrier_func() of cpu3 may not have been executed.
> >
> > Note that the last argument to on_each_cpu() is "1", that is,
> > on_each_cpu() does not return until rcu_barrier_func() has executed on
> > all CPUs.
> >
> > Can your scenario happen?
> > 							Thanx, Paul
> >
> > > -------- Original Message --------
> > > From: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
> > > Date: 2022/10/6 19:57
> > > To: "5486" <3164135486@xxxxxx>
> > > Cc: "rcu" <rcu@xxxxxxxxxxxxxxx>
> > > Subject: Re: Re:
> > >
> > > On Thu, Oct 06, 2022 at 05:56:35PM +0800, 5486 wrote:
> > > > My problem is that some grace period can take very little time,
> > > > shorter than the one iteration of on_each_cpu() itself.  So the
> > > > complete() can execute on one CPU before the last of all.
> > >
> > > And that is the bug.  Very good!
> > >
> > > > But now the key is on_each_cpu(); it is really clever that
> > > > on_each_cpu() will prevent completion of any grace periods.
> > > > A concealed mechanism!  Is it?
> > > > This ancient version of the code has real race-condition bugs.
> > > > I will take some time to study it, find them, and reply to you.
> > > > Is it that on_each_cpu() does not prohibit preemption?  I have
> > > > not yet covered more things related to the operating system.
> > >
> > > Suppose the code instead read as follows?
> > >
> > >	void rcu_barrier(void)
> > >	{
> > >		BUG_ON(in_interrupt());
> > >		/* Take cpucontrol semaphore to protect against CPU hotplug */
> > >		down(&rcu_barrier_sema);
> > >		init_completion(&rcu_barrier_completion);
> > >		atomic_set(&rcu_barrier_cpu_count, 1);
> > >		on_each_cpu(rcu_barrier_func, NULL, 0, 1);
> > >		if (atomic_dec_and_test(&rcu_barrier_cpu_count))
> > >			complete(&rcu_barrier_completion);
> > >		wait_for_completion(&rcu_barrier_completion);
> > >		up(&rcu_barrier_sema);
> > >	}
> > >
> > > Would that work?  Or are there additional bugs?
> > >
> > > 							Thanx, Paul
> > >
> > > > Chao, Li
> > > >
> > > > -------- Original Message --------
> > > > From: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
> > > > Date: 2022/10/5 21:01
> > > > To: "5486" <3164135486@xxxxxx>
> > > > Cc: "rcu" <rcu@xxxxxxxxxxxxxxx>
> > > > Subject: Re: Dear Pualmack help i have a problem with rcu code
> > > >
> > > > On Tue, Oct 04, 2022 at 03:04:19AM +0800, 5486 wrote:
> > > > > Dear Paulmck, I have a problem.
> > > > > I have been reading the patch
> > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/kernel?id=ab4720ec76b756e1f8705e207a7b392b0453afd6
> > > > >
> > > > >	+ wait_for_completion(&rcu_barrier_completion);
> > > > >
> > > > > How does this guarantee that all completes are done?
> > > > > I learned that one wait_for_completion() corresponds to one
> > > > > x->done++ at a time.
> > > >
> > > > 2005!  That takes me back a bit!
> > > >
> > > > Adding the rcu email list on CC to archive this and for others to
> > > > weigh in.
> > > >
> > > > The trick is that rcu_barrier_callback() is an RCU callback that
> > > > executes on each CPU (see the on_each_cpu() in the rcu_barrier()
> > > > function).
> > > >
> > > > This RCU callback function is queued using call_rcu(), which queues
> > > > that callback at the end of the CPU's callback list.  Thus, all
> > > > RCU callback functions that were already queued on that CPU will
> > > > be invoked before rcu_barrier_callback() is invoked.
> > > >
> > > > The rcu_barrier_callback() does an atomic_dec_and_test() operation
> > > > on the rcu_barrier_cpu_count global variable, and only when this
> > > > variable becomes zero does it invoke complete().  This
> > > > atomic_dec_and_test() operation is fully ordered, and thus ensures
> > > > that all the earlier atomic_dec_and_test() operations on this
> > > > variable are globally seen to happen before the last such operation
> > > > (the one that decrements rcu_barrier_cpu_count to zero).
> > > >
> > > > This last rcu_barrier_callback() will then invoke complete().
This will
> > > > ensure that the wakeup in wait_for_completion() happens only after
> > > > all previously queued RCU callbacks are invoked.
> > > >
> > > > Almost, anyway.
> > > >
> > > > There are rare but real race-condition bugs in this ancient version
> > > > of rcu_barrier().  Can you tell me what they are?
> > > >
> > > > 							Thanx, Paul