On Sat, Oct 17, 2020 at 08:35:56PM -0400, joel@xxxxxxxxxxxxxxxxx wrote: > On Sat, Oct 17, 2020 at 03:29:54PM +0200, Frederic Weisbecker wrote: > > > C rcubarrier+ctrldep > > > > > > (* > > > * Result: Never > > > * > > > * This litmus test shows that rcu_barrier (P1) prematurely > > > * returning by reading len 0 can cause issues if P0 does > > > * NOT have a smb_mb() after WRITE_ONCE(len, 1). > > > * mod_data == 2 means module was unloaded (so data is garbage). > > > *) > > > > > > { int len = 0; int enq = 0; } > > > > > > P0(int *len, int *mod_data, int *enq) > > > { > > > int r0; > > > > > > WRITE_ONCE(*len, 1); > > > smp_mb(); /* Needed! */ > > > WRITE_ONCE(*enq, 1); > > > > > > r0 = READ_ONCE(*mod_data); > > > } > > > > > > P1(int *len, int *mod_data, int *enq) > > > { > > > int r0; > > > int r1; > > > > > > r1 = READ_ONCE(*enq); > > > > > > // barrier Just for test purpose ("exists" clause) to force the.. > > > // ..rcu_barrier() to see enq before len > > > smp_mb(); > > > r0 = READ_ONCE(*len); > > > > > > // implicit memory barrier due to conditional */ > > > if (r0 == 0) > > > WRITE_ONCE(*mod_data, 2); > > > } > > > > I'm not sure what scenario P1 refers to in practice, and to what module? > > Kernel module usecase for rcu_barrier. See the docs. My bad, I'm just reading that documentation now :-s > > > > I'm very likely missing something obvious somewhere. > > > > CPU 0 CPU 1 > > rcu_barrier() call_rcu()/rcu_segcblist_enqueue() > > ------------ -------- > > > > smp_mb(); > > inc_len(); > > smp_mb(); > > queue callback; > > for_each_possible_cpu(cpu) > > if (!rcu_segcblist_n_cbs(&rdp->cblist)) > > continue; > > > > > invoke_callback > > If CPU 0 saw the enqueue of the callback (that is the CPU 1's writes to the > segcb_list propagated to CPU 0), then it would have also seen the > effects of the inc_len. I forced this case in my last litmus test by this > code in P1(): But then I can't find to which part of rcu_barrier() this refers to. I see the len read before anything else. > > r1 = READ_ONCE(*enq); > smp_mb(); /* barrier Just for test purpose to show that the.. */ > /* ..rcu_barrier() saw list modification */ > > On the other hand, if CPU 0 did not see the enqueue, then there is really no > issue. Since that is the same case where call_rcu() happened _after_ the > rcu_barrier() and there's no race. rcu_barrier() does not need to wait if > there was no callback enqueued. > > This is not exactly the easiest thing to explain, hence the litmus. Now, reading the documentation of rcu_barrier() (thanks to you!): Pseudo-code using rcu_barrier() is as follows: 1. Prevent any new RCU callbacks from being posted. 2. Execute rcu_barrier(). 3. Allow the module to be unloaded. I think with point 1, it is assumed that the caller of rcu_barrier() must have not only stopped but also sync'ed with the possible enqueuers. Correct me if I'm wrong here. So for example if a kthread used to post the module RCU callbacks, calling kthread_stop() does the job as it prevents from further RCU callbacks from being enqueued and it also syncs with the kthread thanks to the completion implied by any caller of kthread_stop() which then sees what the kthread has read and written, including RCU callbacks enqueued. So if the caller of kthread_stop() calls rcu_barrier() right after, rcu_barrier() should see at least the len corresponding to the last enqueue. cancel_work_sync() also seem to really sync as well. I'm less sure about del_timer_sync(). Say we have: expire_timers (CPU 0) CPU 1 ------------- ----------- detach_timer(timer) raw_spin_unlock(&base->lock); call_timer_fn(timer, fn, baseclk); -> enqueue callback //would need at least smp_wmb() here base->running_timer = NULL; del_timer_sync() { raw_spin_lock(&base->lock); if (base->running_timer != timer) ret = detach_if_pending(timer, base, true); if (!timer_pending()) return 0; raw_spin_unlock(&base->lock); } //would need at least smp_rmb() here //although rcu_seq_start() implies a full barrier rcu_barrier() { // Sees rcu_segcblist_n_cbs(rdp(CPU 0)->cblist) == 0 // So ignore it But I'm sure I'm missing something obvious. That's my specialism.