Adding Alan as well as its memory barrier discussion ;-) On Thu, Oct 15, 2020 at 03:35:11PM +0200, Frederic Weisbecker wrote: > On Wed, Oct 14, 2020 at 08:23:01PM -0400, Joel Fernandes (Google) wrote: > > Memory barriers are needed when updating the full length of the > > segcblist, however it is not fully clearly why one is needed before and > > after. This patch therefore adds additional comments to the function > > header to explain it. > > > > Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx> > > --- > > kernel/rcu/rcu_segcblist.c | 38 ++++++++++++++++++++++++++++++++++---- > > 1 file changed, 34 insertions(+), 4 deletions(-) > > > > diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c > > index 271d5d9d7f60..25ffd07f9951 100644 > > --- a/kernel/rcu/rcu_segcblist.c > > +++ b/kernel/rcu/rcu_segcblist.c > > @@ -147,17 +147,47 @@ static void rcu_segcblist_inc_seglen(struct rcu_segcblist *rsclp, int seg) > > * field to disagree with the actual number of callbacks on the structure. > > * This increase is fully ordered with respect to the callers accesses > > * both before and after. > > + * > > + * About memory barriers: > > + * There is a situation where rcu_barrier() locklessly samples the full > > + * length of the segmented cblist before deciding what to do. That can > > + * race with another path that calls this function. rcu_barrier() should > > + * not wrongly assume there are no callbacks, so any transitions from 1->0 > > + * and 0->1 have to be carefully ordered with respect to list modifications. > > + * > > + * Memory barrier is needed before adding to length, for the case where > > + * v is negative which does not happen in current code, but used > > + * to happen. Keep the memory barrier for robustness reasons. > > Heh, I seem to recongnize someone's decision's style ;-) Actually, the last paragraph I added is bogus. Indeed this memory barrier is not just for robustness reasons. It is needed because rcu_do_batch() adjusts the length of the list (possibly to 0) _after_ executing the callbacks, so that's a negative number: rcu_segcblist_add_len(&rdp->cblist, -count); > > When/If the > > + * length transitions from 1 -> 0, the write to 0 has to be ordered *after* > > + * the memory accesses of the CBs that were dequeued and the segcblist > > + * modifications: > > + * P0 (what P1 sees) P1 > > + * set len = 0 > > + * rcu_barrier sees len as 0 > > + * dequeue from list > > + * rcu_barrier does nothing. > > It's a bit difficult to read that way. So that would be: > > > rcu_do_batch() rcu_barrier() > -- -- > dequeue l = READ(len) > smp_mb() if (!l) > WRITE(len, 0) check next CPU... > > But I'm a bit confused against what it pairs in rcu_barrier(). I believe it pairs with an implied memory barrier via control dependency. The following litmus test would confirm it: C rcubarrier+ctrldep (* * Result: Never * * This litmus test shows that rcu_barrier (P1) prematurely * returning by reading len 0 can cause issues if P0 does * NOT have a smb_mb() before WRITE_ONCE(). * * mod_data == 2 means garbage which the callback should never see. *) { int len = 1; } P0(int *len, int *mod_data) { int r0; // accessed by say RCU callback in rcu_do_batch(); r0 = READ_ONCE(*mod_data); smp_mb(); // Remove this and the "exists" will become true. WRITE_ONCE(*len, 0); } P1(int *len, int *mod_data) { int r0; r0 = READ_ONCE(*len); // rcu_barrier will return early if len is 0 if (r0 == 0) WRITE_ONCE(*mod_data, 2); } // Is it possible? exists (0:r0=2 /\ 1:r0=0) > > + * > > + * Memory barrier is needed after adding to length for the case > > + * where length transitions from 0 -> 1. This is because rcu_barrier() > > + * should never miss an update to the length. So the update to length > > + * has to be seen *before* any modifications to the segmented list. Otherwise a > > + * race can happen. > > + * P0 (what P1 sees) P1 > > + * queue to list > > + * rcu_barrier sees len as 0 > > + * set len = 1. > > + * rcu_barrier does nothing. > > So that would be: > > call_rcu() rcu_barrier() > -- -- > WRITE(len, len + 1) l = READ(len) > smp_mb() if (!l) > queue check next CPU... > > > But I still don't see against what it pairs in rcu_barrier. Actually, for the second case maybe a similar reasoning can be applied (control dependency) but I'm unable to come up with a litmus test. In fact, now I'm wondering how is it possible that call_rcu() races with rcu_barrier(). The module should ensure that no more call_rcu() should happen before rcu_barrier() is called. confused, - Joel