On Thu, Oct 06, 2022 at 05:56:35PM +0800, 5486 wrote:
> My problem is that some grace periods can take very little time, shorter
> than one iteration of on_each_cpu() itself. So the complete() can execute
> on one CPU before the last of all.

And that is the bug.  Very good!

> But now:
> The key is on_each_cpu(). It is really clever that on_each_cpu() will
> prevent completion of any grace periods. A concealed mechanism!
> Is it?
> This ancient version of the code has real race-condition bugs. I will take
> some time to study it, find them, and reply to you.
> Is it that on_each_cpu() does not prohibit preemption? I have not covered
> many other things related to the operating system.

Suppose the code instead read as follows?

	void rcu_barrier(void)
	{
		BUG_ON(in_interrupt());
		/* Take cpucontrol semaphore to protect against CPU hotplug */
		down(&rcu_barrier_sema);
		init_completion(&rcu_barrier_completion);
		atomic_set(&rcu_barrier_cpu_count, 1);
		on_each_cpu(rcu_barrier_func, NULL, 0, 1);
		if (atomic_dec_and_test(&rcu_barrier_cpu_count))
			complete(&rcu_barrier_completion);
		wait_for_completion(&rcu_barrier_completion);
		up(&rcu_barrier_sema);
	}

Would that work?  Or are there additional bugs?

							Thanx, Paul

> Chao,Li
>
> Original Message
>
> From: "Paul E. McKenney"< paulmck@xxxxxxxxxx >;
> Sent: 2022/10/5 21:01
> To: "5486"< 3164135486@xxxxxx >;
> Cc: "rcu"< rcu@xxxxxxxxxxxxxxx >;
> Subject: Re: Dear Pualmack help i have a problem with rcu code
>
> On Tue, Oct 04, 2022 at 03:04:19AM +0800, 5486 wrote:
> > Dear Pualmack i have problem.
> > I have been reading the patch
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/kernel?id=ab4720ec76b756e1f8705e207a7b392b0453afd6
> >
> > + wait_for_completion(&rcu_barrier_completion);
> >
> > How does this guarantee that all the completes are done?
> > I learned that one wait_for_completion() corresponds to one x->done++ at a time.
>
> 2005!  That takes me back a bit!
>
> Adding the rcu email list on CC to archive this and for others to
> weigh in.
>
> The trick is that rcu_barrier_callback() is an RCU callback that executes
> on each CPU (see the on_each_cpu() in the rcu_barrier() function.)
>
> This RCU callback function is queued using call_rcu(), which queues
> that callback at the end of the CPU's callback list.  Thus, all
> RCU callback functions that were already queued on that CPU will
> be invoked before rcu_barrier_callback() is invoked.
>
> The rcu_barrier_callback() does an atomic_dec_and_test() operation on the
> rcu_barrier_cpu_count global variable, and only when this variable becomes
> zero does it invoke complete().  This atomic_dec_and_test() operation is
> fully ordered, and thus ensures that all the earlier atomic_dec_and_test()
> operations on this variable are globally seen to happen before the last
> such operation (the one that decrements rcu_barrier_cpu_count to zero).
>
> This last rcu_barrier_callback() will then invoke complete().  This will
> ensure that the wakeup in wait_for_completion() happens only after all
> previously queued RCU callbacks are invoked.
>
> Almost, anyway.
>
> There are rare but real race-condition bugs in this ancient version
> of rcu_barrier().  Can you tell me what they are?
>
> 							Thanx, Paul
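
For readers following the thread, the early-completion race described at the
top of this message, and the effect of starting rcu_barrier_cpu_count at 1
instead of 0, can be modeled in userspace.  What follows is a minimal
single-threaded sketch, not kernel code.  The names barrier_model,
model_callback() and premature_completion() are hypothetical, and the loop
hard-codes one worst-case interleaving in which each CPU's callback runs
before the next CPU has registered.  It says nothing about whether other
bugs remain.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of the counting scheme; not kernel code. */
struct barrier_model {
	atomic_int count;   /* stands in for rcu_barrier_cpu_count */
	int completions;    /* number of times complete() would be called */
	int registered_at_first_completion; /* CPUs registered at first complete() */
};

/* Models the decrement-and-test; "registered" is how many CPUs have queued so far. */
static void model_callback(struct barrier_model *m, int registered)
{
	/* atomic_fetch_sub() returns the old value, so old == 1 means we hit zero. */
	if (atomic_fetch_sub(&m->count, 1) == 1) {
		if (m->completions == 0)
			m->registered_at_first_completion = registered;
		m->completions++;
	}
}

/*
 * Returns true if complete() would fire before every CPU has registered,
 * or more than once.  initial_count is 0 (original code) or 1 (biased).
 */
static bool premature_completion(int ncpus, int initial_count)
{
	struct barrier_model m = {
		.completions = 0,
		.registered_at_first_completion = -1,
	};

	assert(ncpus > 0);
	atomic_init(&m.count, initial_count);

	for (int cpu = 0; cpu < ncpus; cpu++) {
		/* Models rcu_barrier_func(): increment, then queue the callback. */
		atomic_fetch_add(&m.count, 1);
		/* Assumed worst case: the grace period ends at once, callback runs now. */
		model_callback(&m, cpu + 1);
	}

	/* With a bias of 1, the initiator drops its own count after on_each_cpu(). */
	if (initial_count == 1)
		model_callback(&m, ncpus);

	return m.completions != 1 || m.registered_at_first_completion != ncpus;
}
```

Calling premature_completion(4, 0) models the original code on four CPUs,
and premature_completion(4, 1) models the variant with the count biased
to 1.  In this model only the former reaches zero before all CPUs have
registered.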