On Thu, Aug 27, 2020 at 06:55:18PM -0400, Joel Fernandes wrote:
> On Wed, Aug 26, 2020 at 07:20:28AM -0700, Paul E. McKenney wrote:
> [...]
> > > > Or better yet, please see below, which should allow getting rid of both
> > > > of them.
> > > >
> > > > >  	rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
> > > > >  	rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
> > > > > -	rcu_segcblist_insert_count(dst_rsclp, &donecbs);
> > > > > +
> > > > > +	rcu_segcblist_add_len(dst_rsclp, src_len);
> > > > >  	rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
> > > > >  	rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);
> > > >
> > > > Rather than adding the blank lines, why not have the rcu_cblist structures
> > > > carry the lengths?  You are already adjusting one of the two call sites
> > > > that care (rcu_do_batch()), and the other is srcu_invoke_callbacks().
> > > > That should shorten this function a bit more.  And make callback handling
> > > > much more approachable, I suspect.
> > >
> > > Sorry, I did not understand. The rcu_cblist structure already has a length
> > > field. I do modify rcu_segcblist_extract_done_cbs() and
> > > rcu_segcblist_extract_pend_cbs() to carry the length already, in a later
> > > patch.
> > >
> > > Just to emphasize, this patch is just a small refactor to avoid an issue in
> > > later patches. It aims to keep current functionality unchanged.
> >
> > True enough.  I am just suggesting that an equally small refactor in
> > a slightly different direction should get to a better place.  The key
> > point enabling this slightly different direction is that this code is
> > an exception to the "preserve ->cblist.len" rule because it is invoked
> > only from the CPU hotplug code.
> >
> > So you could use the rcu_cblist .len field to update the ->cblist.len
> > field, thus combining the _cbs and _count updates.
> > One thing that helps
> > is that setting the rcu_cblist .len field doesn't hurt the other use
> > cases that require careful handling of ->cblist.len.
>
> Thank you for the ideas. I am trying something like this on top of this
> series based on the ideas. One thing I concerned a bit is if getting rid of
> the rcu_segcblist_xchg_len() function (which has memory barriers in them)
> causes issues in the hotplug path. I am now directly updating the length
> without additional memory barriers. I will test it more and try to reason
> more about it as well.

In this particular case, the CPU-hotplug locks prevent rcu_barrier() from
running concurrently, so it should be OK.  Is there an easy way to make
lockdep help us check this?  Does lockdep_assert_cpus_held() suffice,
or is it too easily satisfied?

> ---8<-----------------------
>
> From: Joel Fernandes <joelaf@xxxxxxxxxx>
> Date: Thu, 27 Aug 2020 18:30:25 -0400
> Subject: [PATCH] fixup! rcu/segcblist: Do not depend on donecbs ->len to store
>  the segcb len during merge
>
> Signed-off-by: Joel Fernandes <joelaf@xxxxxxxxxx>
> ---
>  kernel/rcu/rcu_segcblist.c | 38 ++++----------------------------------
>  1 file changed, 4 insertions(+), 34 deletions(-)
>
> diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
> index 79c2cbe388c5..c33abbc97a07 100644
> --- a/kernel/rcu/rcu_segcblist.c
> +++ b/kernel/rcu/rcu_segcblist.c
> @@ -175,26 +175,6 @@ void rcu_segcblist_inc_len(struct rcu_segcblist *rsclp)
>  	rcu_segcblist_add_len(rsclp, 1);
>  }
>
> -/*
> - * Exchange the numeric length of the specified rcu_segcblist structure
> - * with the specified value.  This can cause the ->len field to disagree
> - * with the actual number of callbacks on the structure.  This exchange is
> - * fully ordered with respect to the callers accesses both before and after.
> - */
> -static long rcu_segcblist_xchg_len(struct rcu_segcblist *rsclp, long v)
> -{
> -#ifdef CONFIG_RCU_NOCB_CPU
> -	return atomic_long_xchg(&rsclp->len, v);
> -#else
> -	long ret = rsclp->len;
> -
> -	smp_mb(); /* Up to the caller! */
> -	WRITE_ONCE(rsclp->len, v);
> -	smp_mb(); /* Up to the caller! */
> -	return ret;
> -#endif
> -}
> -

This looks nice!

>  /*
>   * Initialize an rcu_segcblist structure.
>   */
> @@ -361,6 +341,7 @@ void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
>  		if (rsclp->tails[i] == rsclp->tails[RCU_DONE_TAIL])
>  			WRITE_ONCE(rsclp->tails[i], &rsclp->head);
>  	rcu_segcblist_set_seglen(rsclp, RCU_DONE_TAIL, 0);
> +	rcu_segcblist_add_len(rsclp, -(rclp->len));
>  }
>
>  /*
> @@ -414,17 +395,7 @@ void rcu_segcblist_extract_pend_cbs(struct rcu_segcblist *rsclp,
>  		WRITE_ONCE(rsclp->tails[i], rsclp->tails[RCU_DONE_TAIL]);
>  		rcu_segcblist_set_seglen(rsclp, i, 0);
>  	}
> -}
> -
> -/*
> - * Insert counts from the specified rcu_cblist structure in the
> - * specified rcu_segcblist structure.
> - */
> -void rcu_segcblist_insert_count(struct rcu_segcblist *rsclp,
> -				struct rcu_cblist *rclp)
> -{
> -	rcu_segcblist_add_len(rsclp, rclp->len);
> -	rclp->len = 0;
> +	rcu_segcblist_add_len(rsclp, -(rclp->len));

As does this.  ;-)

>  }
>
>  /*
> @@ -448,6 +419,7 @@ void rcu_segcblist_insert_done_cbs(struct rcu_segcblist *rsclp,
>  			break;
>  	rclp->head = NULL;
>  	rclp->tail = &rclp->head;
> +	rcu_segcblist_add_len(rsclp, rclp->len);

Does there need to be a compensating action in rcu_do_batch(), or is
this the point of the "rcu_segcblist_add_len(rsclp, -(rclp->len));"
added to rcu_segcblist_extract_done_cbs() above?

If so, a comment would be good.
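
For reference, the accounting being asked about can be modeled in userspace.
The following is only a minimal sketch under loud assumptions: struct cb,
struct cblist, struct seglist and every function name are hypothetical
stand-ins (not the kernel's rcu_head, rcu_cblist or rcu_segcblist), the
segmented list is reduced to a single segment, and memory ordering and
concurrency are ignored entirely.  It shows only that a length subtracted at
extract time is balanced by the length added at insert time.

```c
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's definitions. */
struct cb {
	struct cb *next;
};

/* Models struct rcu_cblist: a simple list that carries its own length. */
struct cblist {
	struct cb *head;
	struct cb **tail;
	long len;
};

/* Models struct rcu_segcblist, reduced to a single segment. */
struct seglist {
	struct cb *head;
	struct cb **tail;
	long len;
};

static void cblist_init(struct cblist *cl)
{
	cl->head = NULL;
	cl->tail = &cl->head;
	cl->len = 0;
}

static void seglist_init(struct seglist *sl)
{
	sl->head = NULL;
	sl->tail = &sl->head;
	sl->len = 0;
}

static void seglist_enqueue(struct seglist *sl, struct cb *cbp)
{
	cbp->next = NULL;
	*sl->tail = cbp;
	sl->tail = &cbp->next;
	sl->len++;
}

/* Move every callback from sl to cl; the length travels with them. */
static void seglist_extract(struct seglist *sl, struct cblist *cl)
{
	long moved = sl->len;

	if (!moved)
		return;
	*cl->tail = sl->head;
	cl->tail = sl->tail;
	cl->len += moved;
	sl->head = NULL;
	sl->tail = &sl->head;
	sl->len -= moved;	/* stands in for add_len(rsclp, -(rclp->len)) */
}

/* Append cl to sl and hand the length back. */
static void seglist_insert(struct seglist *sl, struct cblist *cl)
{
	if (!cl->head)
		return;
	*sl->tail = cl->head;
	sl->tail = cl->tail;
	sl->len += cl->len;	/* stands in for add_len(rsclp, rclp->len) */
	cblist_init(cl);
}
```

In this model, extracting from one list and inserting into another leaves
the sum of the two lengths unchanged, with no separate compensating step
needed.  Whether the same holds for rcu_do_batch() depends on details of the
real code that this sketch does not capture.
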
>  }
>
>  /*
> @@ -463,6 +435,7 @@ void rcu_segcblist_insert_pend_cbs(struct rcu_segcblist *rsclp,
>  	rcu_segcblist_add_seglen(rsclp, RCU_NEXT_TAIL, rclp->len);
>  	WRITE_ONCE(*rsclp->tails[RCU_NEXT_TAIL], rclp->head);
>  	WRITE_ONCE(rsclp->tails[RCU_NEXT_TAIL], rclp->tail);
> +	rcu_segcblist_add_len(rsclp, rclp->len);
>  }
>
>  /*
> @@ -601,16 +574,13 @@ void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
>  {
>  	struct rcu_cblist donecbs;
>  	struct rcu_cblist pendcbs;
> -	long src_len;
>
>  	rcu_cblist_init(&donecbs);
>  	rcu_cblist_init(&pendcbs);
>
> -	src_len = rcu_segcblist_xchg_len(src_rsclp, 0);
>  	rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
>  	rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
>
> -	rcu_segcblist_add_len(dst_rsclp, src_len);
>  	rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
>  	rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);

Can we now pair the corresponding _extract_ and _insert_ calls, thus
requiring only one rcu_cblist structure?  This would drop two more lines
of code.  Or would that break something?

							Thanx, Paul
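
On the question of pairing each _extract_ call with its _insert_ call, the
shape of such a merge can be sketched in userspace.  This is an illustration
under stated assumptions, not the kernel implementation: all type and
function names are hypothetical, the list has only two segments ("done" and
"pending") rather than the kernel's full set, and nothing here models memory
ordering, locking, or no-CBs CPUs.  It shows the list manipulation alone
getting by with one temporary list; it does not settle whether the real code
would break.

```c
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's definitions. */
struct cb { struct cb *next; };
struct cblist { struct cb *head; struct cb **tail; long len; };

/* Two segments: callbacks up to *done_tail are "done", the rest "pending". */
struct seglist {
	struct cb *head;
	struct cb **done_tail, **tail;
	long done_len, len;
};

static void cblist_init(struct cblist *cl)
{
	cl->head = NULL;
	cl->tail = &cl->head;
	cl->len = 0;
}

static void seglist_init(struct seglist *sl)
{
	sl->head = NULL;
	sl->done_tail = sl->tail = &sl->head;
	sl->done_len = sl->len = 0;
}

static void extract_done(struct seglist *sl, struct cblist *cl)
{
	if (!sl->done_len)
		return;
	cl->head = sl->head;
	cl->tail = sl->done_tail;
	cl->len = sl->done_len;
	sl->head = *sl->done_tail;	/* first pending callback, or NULL */
	*cl->tail = NULL;
	if (sl->tail == sl->done_tail)
		sl->tail = &sl->head;
	sl->done_tail = &sl->head;
	sl->len -= cl->len;
	sl->done_len = 0;
}

static void extract_pend(struct seglist *sl, struct cblist *cl)
{
	if (!*sl->done_tail)
		return;
	cl->head = *sl->done_tail;
	cl->tail = sl->tail;
	cl->len = sl->len - sl->done_len;
	*sl->done_tail = NULL;
	sl->tail = sl->done_tail;
	sl->len -= cl->len;
}

/* Done callbacks go at the front of the destination. */
static void insert_done(struct seglist *sl, struct cblist *cl)
{
	if (!cl->head)
		return;
	*cl->tail = sl->head;
	if (sl->done_tail == &sl->head)
		sl->done_tail = cl->tail;
	if (sl->tail == &sl->head)
		sl->tail = cl->tail;
	sl->head = cl->head;
	sl->done_len += cl->len;
	sl->len += cl->len;
	cblist_init(cl);
}

/* Pending callbacks go at the end of the destination. */
static void insert_pend(struct seglist *sl, struct cblist *cl)
{
	if (!cl->head)
		return;
	*sl->tail = cl->head;
	sl->tail = cl->tail;
	sl->len += cl->len;
	cblist_init(cl);
}

/* Merge with each extract paired with its insert, using one temporary. */
static void seglist_merge(struct seglist *dst, struct seglist *src)
{
	struct cblist tmp;

	cblist_init(&tmp);
	extract_done(src, &tmp);
	insert_done(dst, &tmp);
	extract_pend(src, &tmp);
	insert_pend(dst, &tmp);
}
```

The single temporary works in this model because each insert resets it
before the next extract reuses it, and because an extract that finds nothing
leaves it empty, so the following insert becomes a no-op.
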