On Sun, Apr 19, 2020 at 08:24:45AM -0700, Paul E. McKenney wrote:
>On Sun, Apr 19, 2020 at 06:02:12AM +0800, Wei Yang wrote:
>> On Sat, Apr 18, 2020 at 11:19 AM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
>> >
>> > On Fri, Apr 17, 2020 at 09:39:51PM +0000, Wei Yang wrote:
>> > > There is only 1 bit set in mask, which means the difference between
>> > > oldmask and the new one would be at the position where the bit is set in
>> > > mask.
>> > >
>> > > Based on this knowledge, rcu_state.ncpus could be calculated by checking
>> > > whether mask is already set in oldmask.
>> >
>> > Nice!!! Good eyes!
>> >
>> > > BTW, the comment at the end of this line is mysterious. Not sure whether
>> > > it could be removed or not.
>> >
>> > The "^^^" in that comment says to look at the comment on the preceding
>> > line.  Memory-ordering functions like smp_store_release() are supposed
>> > to have comments indicating what they are ordering.  ;-)
>> >
>> > Could you please do the following things and resubmit?
>> >
>> > 1.	Forward-port to -rcu branch dev?  This tree lives here:
>> > 	git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
>> >
>> > 2.	Given that oldmask is used only to test to see if a new bit
>> > 	was set, why not just replace oldmask with a bool variable
>> > 	that is set to "!(rnp->expmaskinitnext & mask)" before the
>> > 	bit is ORed into rnp->expmaskinitnext?
>> >
>> > 3.	Put the comment inside the "if" statement with the
>> > 	smp_store_release().
>> >
>> > 4.	In -rcu, you will find an ASSERT_EXCLUSIVE_WRITER() statement
>> > 	that should also be placed inside the "if" statement with
>> > 	the smp_store_release().
>> >
>>
>> Oops, my email client ate that mail. Hope this one does not get blocked.
>
>This one made it.  ;-)
>
>> I adjusted the code a little according to your suggestion, like below. Is
>> this what you expected?
>
>Much better, thank you!
>
>> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
>> index f288477ee1c2..f01367a80b70 100644
>> --- a/kernel/rcu/tree.c
>> +++ b/kernel/rcu/tree.c
>> @@ -3732,10 +3732,9 @@ void rcu_cpu_starting(unsigned int cpu)
>>  {
>>  	unsigned long flags;
>>  	unsigned long mask;
>> -	int nbits;
>> -	unsigned long oldmask;
>>  	struct rcu_data *rdp;
>>  	struct rcu_node *rnp;
>> +	bool has_seen;
>
>Could you please use a more descriptive name, perhaps something like
>"newcpu"?  Yes, "has_seen" does make sense, but it leaves the reader
>the question "has seen what?".
>
>>  	if (per_cpu(rcu_cpu_started, cpu))
>>  		return;
>> @@ -3747,13 +3746,13 @@ void rcu_cpu_starting(unsigned int cpu)
>>  	mask = rdp->grpmask;
>>  	raw_spin_lock_irqsave_rcu_node(rnp, flags);
>>  	WRITE_ONCE(rnp->qsmaskinitnext, rnp->qsmaskinitnext | mask);
>> -	oldmask = rnp->expmaskinitnext;
>> +	has_seen = rnp->expmaskinitnext & mask;
>>  	rnp->expmaskinitnext |= mask;
>> -	oldmask ^= rnp->expmaskinitnext;
>> -	nbits = bitmap_weight(&oldmask, BITS_PER_LONG);
>> -	/* Allow lockless access for expedited grace periods. */
>> -	smp_store_release(&rcu_state.ncpus, rcu_state.ncpus + nbits); /* ^^^ */
>> -	ASSERT_EXCLUSIVE_WRITER(rcu_state.ncpus);
>> +	if (!has_seen) {
>> +		/* Allow lockless access for expedited grace periods. */
>> +		smp_store_release(&rcu_state.ncpus, rcu_state.ncpus + 1); /* ^^^ */
>> +		ASSERT_EXCLUSIVE_WRITER(rcu_state.ncpus);
>
>One advantage of flipping the conditional is that it allows dropping the
>"if" statement, which saves a few lines of code and gets closer to your
>original suggestion:
>
>	newcpu = !(rnp->expmaskinitnext & mask);
>	rnp->expmaskinitnext |= mask;
>	/* Allow lockless access for expedited grace periods. */
>	smp_store_release(&rcu_state.ncpus, rcu_state.ncpus + newcpu); /* ^^^ */
>	ASSERT_EXCLUSIVE_WRITER(rcu_state.ncpus);
>
>Yes, this does execute a store-release that doesn't change anything,
>but this is OK because (1) CPU hotplug is infrequent and (2) CPU hotplug
>has very high overhead.  The overhead of the store is thus negligible.
>(But please do not try this on heavily contended cachelines in fastpaths!)

I think I got your point.
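To see the trick outside the kernel, here is a minimal userspace sketch of
the same pattern. All names here are hypothetical stand-ins, a plain store
takes the place of smp_store_release(), and ASSERT_EXCLUSIVE_WRITER() is
omitted:

	#include <assert.h>
	#include <stdbool.h>
	#include <stdio.h>

	/* Hypothetical stand-ins for rnp->expmaskinitnext and rcu_state.ncpus. */
	static unsigned long expmaskinit;
	static unsigned long ncpus;

	/* Set one CPU's bit; count it only the first time the bit appears. */
	static void mark_cpu(unsigned long mask)
	{
		bool newcpu = !(expmaskinit & mask);	/* Bit not set yet? */

		expmaskinit |= mask;
		/* Adding a bool (0 or 1) makes the "if" unnecessary. */
		ncpus += newcpu;
	}

	int main(void)
	{
		mark_cpu(1UL << 3);	/* New bit: ncpus 0 -> 1. */
		mark_cpu(1UL << 3);	/* Same bit again: newcpu == 0, no change. */
		mark_cpu(1UL << 7);	/* Another new bit: ncpus 1 -> 2. */
		assert(ncpus == 2);
		printf("ncpus = %lu\n", ncpus);
		return 0;
	}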
>
>And yes, I should have thought of this in response to your initial patch,
>but then again, I should have thought of this when writing this code in
>the first place, shouldn't I have?  ;-)
>

Fair enough.

>							Thanx, Paul
>
>> +	}
>>  	rcu_gpnum_ovf(rnp, rdp); /* Offline-induced counter wrap? */
>>  	rdp->rcu_onl_gp_seq = READ_ONCE(rcu_state.gp_seq);
>>  	rdp->rcu_onl_gp_flags = READ_ONCE(rcu_state.gp_flags);
>>
>> > 							Thanx, Paul
>> >
>> > > Signed-off-by: Wei Yang <richard.weiyang@xxxxxxxxx>
>> > > ---
>> > >  kernel/rcu/tree.c | 6 ++----
>> > >  1 file changed, 2 insertions(+), 4 deletions(-)
>> > >
>> > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
>> > > index d91c9156fab2..f0d9251fa663 100644
>> > > --- a/kernel/rcu/tree.c
>> > > +++ b/kernel/rcu/tree.c
>> > > @@ -3364,7 +3364,6 @@ void rcu_cpu_starting(unsigned int cpu)
>> > >  {
>> > >  	unsigned long flags;
>> > >  	unsigned long mask;
>> > > -	int nbits;
>> > >  	unsigned long oldmask;
>> > >  	struct rcu_data *rdp;
>> > >  	struct rcu_node *rnp;
>> > > @@ -3381,10 +3380,9 @@ void rcu_cpu_starting(unsigned int cpu)
>> > >  	rnp->qsmaskinitnext |= mask;
>> > >  	oldmask = rnp->expmaskinitnext;
>> > >  	rnp->expmaskinitnext |= mask;
>> > > -	oldmask ^= rnp->expmaskinitnext;
>> > > -	nbits = bitmap_weight(&oldmask, BITS_PER_LONG);
>> > >  	/* Allow lockless access for expedited grace periods. */
>> > > -	smp_store_release(&rcu_state.ncpus, rcu_state.ncpus + nbits); /* ^^^ */
>> > > +	if (!(oldmask & mask))
>> > > +		smp_store_release(&rcu_state.ncpus, rcu_state.ncpus + 1); /* ^^^ */
>> > >  	rcu_gpnum_ovf(rnp, rdp); /* Offline-induced counter wrap? */
>> > >  	rdp->rcu_onl_gp_seq = READ_ONCE(rcu_state.gp_seq);
>> > >  	rdp->rcu_onl_gp_flags = READ_ONCE(rcu_state.gp_flags);
>> > > --
>> > > 2.23.0

-- 
Wei Yang
Help you, Help me
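For readers following the "Allow lockless access for expedited grace
periods" comment above: the point of the store-release is that a lockless
reader pairing it with a load-acquire is guaranteed to see everything that
was initialized before the count was bumped. Below is a minimal C11 sketch
of that release/acquire pairing; the names are hypothetical and the kernel's
actual readers and primitives differ:

	#include <stdatomic.h>
	#include <stdio.h>

	static unsigned long cpu_data[64];	/* Filled in before the count is bumped. */
	static atomic_ulong ncpus;		/* Published CPU count. */

	static void bring_cpu_online(unsigned long n)
	{
		cpu_data[n] = n * 100;	/* Per-CPU setup... */
		/* ...ordered before the new count by the release store. */
		atomic_store_explicit(&ncpus, n + 1, memory_order_release);
	}

	static void lockless_reader(void)
	{
		/* The acquire load pairs with the release store above, so
		 * every cpu_data[i] with i < n is seen fully initialized. */
		unsigned long n = atomic_load_explicit(&ncpus, memory_order_acquire);

		for (unsigned long i = 0; i < n; i++)
			printf("cpu %lu: %lu\n", i, cpu_data[i]);
	}

	int main(void)
	{
		bring_cpu_online(0);
		bring_cpu_online(1);
		lockless_reader();
		return 0;
	}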