On Tue, Oct 13, 2020 at 01:20:08AM +0200, Frederic Weisbecker wrote:
> On Wed, Sep 23, 2020 at 11:22:09AM -0400, Joel Fernandes (Google) wrote:
> > +/* Return number of callbacks in a segment of the segmented callback list. */
> > +static void rcu_segcblist_add_seglen(struct rcu_segcblist *rsclp, int seg, long v)
> > +{
> > +#ifdef CONFIG_RCU_NOCB_CPU
> > +	smp_mb__before_atomic(); /* Up to the caller! */
> > +	atomic_long_add(v, &rsclp->seglen[seg]);
> > +	smp_mb__after_atomic(); /* Up to the caller! */
> > +#else
> > +	smp_mb(); /* Up to the caller! */
> > +	WRITE_ONCE(rsclp->seglen[seg], rsclp->seglen[seg] + v);
> > +	smp_mb(); /* Up to the caller! */
> > +#endif
> > +}
>
> I know that these "Up to the caller" comments come from the existing len
> functions but perhaps we should explain a bit more against what it is ordering
> and what it pairs to.
>
> Also why do we need one before _and_ after?
>
> And finally do we have the same ordering requirements than the unsegmented len
> field?

Hi Paul and Neeraj,
Would be nice to discuss this on the call. I actually borrowed the memory
barriers from add_len() just to be safe, but I think Frederic's points are
valid. Would be nice if we can go over all the use cases and discuss which
memory barriers are needed. Thanks for your help!

Another thought: inc_len() calls add_len(), which already has smp_mb(), so
callers of inc_len() also do not need memory barriers, I think.

thanks,

 - Joel

> > +
> > +/* Move from's segment length to to's segment. */
> > +static void rcu_segcblist_move_seglen(struct rcu_segcblist *rsclp, int from, int to)
> > +{
> > +	long len;
> > +
> > +	if (from == to)
> > +		return;
> > +
> > +	len = rcu_segcblist_get_seglen(rsclp, from);
> > +	if (!len)
> > +		return;
> > +
> > +	rcu_segcblist_add_seglen(rsclp, to, len);
> > +	rcu_segcblist_set_seglen(rsclp, from, 0);
> > +}
> > +
> [...]
> > @@ -245,6 +283,7 @@ void rcu_segcblist_enqueue(struct rcu_segcblist *rsclp,
> >  				   struct rcu_head *rhp)
> >  {
> >  	rcu_segcblist_inc_len(rsclp);
> > +	rcu_segcblist_inc_seglen(rsclp, RCU_NEXT_TAIL);
> >  	smp_mb(); /* Ensure counts are updated before callback is enqueued. */
>
> Since inc_len and even now inc_seglen have two full barriers embracing the add up,
> we can probably spare the above smp_mb()?
>
> >  	rhp->next = NULL;
> >  	WRITE_ONCE(*rsclp->tails[RCU_NEXT_TAIL], rhp);
> > @@ -274,27 +313,13 @@ bool rcu_segcblist_entrain(struct rcu_segcblist *rsclp,
> >  	for (i = RCU_NEXT_TAIL; i > RCU_DONE_TAIL; i--)
> >  		if (rsclp->tails[i] != rsclp->tails[i - 1])
> >  			break;
> > +	rcu_segcblist_inc_seglen(rsclp, i);
> >  	WRITE_ONCE(*rsclp->tails[i], rhp);
> >  	for (; i <= RCU_NEXT_TAIL; i++)
> >  		WRITE_ONCE(rsclp->tails[i], &rhp->next);
> >  	return true;
> >  }
> >
> > @@ -403,6 +437,7 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq)
> >  		if (ULONG_CMP_LT(seq, rsclp->gp_seq[i]))
> >  			break;
> >  		WRITE_ONCE(rsclp->tails[RCU_DONE_TAIL], rsclp->tails[i]);
> > +		rcu_segcblist_move_seglen(rsclp, i, RCU_DONE_TAIL);
>
> Do we still need the same amount of full barriers contained in add() called by move() here?
> It's called in the reverse order (write queue then len) than usual. If I trust the comment
> in rcu_segcblist_enqueue(), the point of the barrier is to make the length visible before
> the new callback for rcu_barrier() (although that concerns len and not seglen). But here
> above, the unsegmented length doesn't change. I could understand a write barrier between
> add_seglen(x, i) and set_seglen(0, RCU_DONE_TAIL) but I couldn't find a paired couple either.
>
> >  	}
> >
> >  	/* If no callbacks moved, nothing more need be done. */
> > @@ -423,6 +458,7 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq)
> >  		if (rsclp->tails[j] == rsclp->tails[RCU_NEXT_TAIL])
> >  			break; /* No more callbacks. */
> >  		WRITE_ONCE(rsclp->tails[j], rsclp->tails[i]);
> > +		rcu_segcblist_move_seglen(rsclp, i, j);
>
> Same question here (feel free to reply "same answer" :o)
>
> Thanks!
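
For reference while going over the use cases, here is a minimal user-space
sketch of the accounting in the quoted hunks. It is an illustration only, not
the kernel code: the model_* names and the MODEL_* segment indices are
hypothetical, C11 atomic_thread_fence(memory_order_seq_cst) is used as a rough
stand-in for smp_mb(), and the fences simply mirror what the patch does without
claiming any of them is required.

```c
/* User-space model only; every model_* / MODEL_* name is hypothetical. */
#include <assert.h>
#include <stdatomic.h>

enum { MODEL_DONE_TAIL, MODEL_WAIT_TAIL, MODEL_NEXT_READY_TAIL,
       MODEL_NEXT_TAIL, MODEL_NSEGS };

struct seg_model {
	atomic_long len;                 /* unsegmented callback count */
	atomic_long seglen[MODEL_NSEGS]; /* per-segment callback counts */
};

static void model_init(struct seg_model *m)
{
	atomic_init(&m->len, 0);
	for (int i = 0; i < MODEL_NSEGS; i++)
		atomic_init(&m->seglen[i], 0);
}

static long model_get_seglen(struct seg_model *m, int seg)
{
	return atomic_load_explicit(&m->seglen[seg], memory_order_relaxed);
}

static void model_set_seglen(struct seg_model *m, int seg, long v)
{
	atomic_store_explicit(&m->seglen[seg], v, memory_order_relaxed);
}

/* Mirrors add_seglen(): a full fence before and after the update. */
static void model_add_seglen(struct seg_model *m, int seg, long v)
{
	atomic_thread_fence(memory_order_seq_cst); /* stand-in for smp_mb() */
	atomic_fetch_add_explicit(&m->seglen[seg], v, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst); /* stand-in for smp_mb() */
}

/* Mirrors move_seglen(): add to the destination, then zero the source. */
static void model_move_seglen(struct seg_model *m, int from, int to)
{
	long len;

	if (from == to)
		return;
	len = model_get_seglen(m, from);
	if (!len)
		return;
	model_add_seglen(m, to, len);
	model_set_seglen(m, from, 0);
}

/* Mirrors the counting part of enqueue(): len first, then the seglen. */
static void model_enqueue(struct seg_model *m)
{
	atomic_fetch_add(&m->len, 1); /* seq_cst read-modify-write */
	model_add_seglen(m, MODEL_NEXT_TAIL, 1);
}

static long model_seglen_sum(struct seg_model *m)
{
	long sum = 0;

	for (int i = 0; i < MODEL_NSEGS; i++)
		sum += model_get_seglen(m, i);
	return sum;
}

/* Single-threaded check: segment counts add up to the unsegmented count. */
static void model_check(struct seg_model *m)
{
	assert(model_seglen_sum(m) == atomic_load(&m->len));
}
```

In this model a move leaves len untouched and only shifts counts between
segments, which matches the observation above that the unsegmented length
doesn't change in rcu_segcblist_advance(). Because the destination is
incremented before the source is zeroed, a concurrent reader summing the
segments between the two stores could transiently see more than len. Whether
any reader cares about that window is exactly the pairing question raised
above, and the sketch does not answer it.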