On Wed, Nov 16, 2022 at 09:56:26AM +0800, Pingfan Liu wrote:
> The pair of segcblist operations: rcu_segcblist_advance() and
> rcu_segcblist_accelerate() in srcu_gp_start() is needless from two
> perspectives:
>
> -1. As a part of the SRCU state machine, it should take care of either
> all sda or none. But here it only takes care of a single sda.

I am not so sure that I agree.  Taking care of all srcu_node structures'
callbacks would be extremely expensive, with the expense increasing with
the number of CPUs.  However, the cost of taking care of the current
CPU's callbacks is quite cheap, at least in the case where the
srcu_struct structure's ->srcu_size_state field is equal to
SRCU_SIZE_BIG.

But are you seeing performance issues with that code on real hardware
running real workloads when the srcu_struct structure's
->srcu_size_state field does not equal SRCU_SIZE_BIG?  My guess is "no"
given that this code has already paid the cost of acquiring the
srcu_struct structure's ->lock, but I figured I should ask.  The
opinion of real hardware running real workloads beats any of our
opinions, after all.  ;-)

If you are seeing real slowdowns, then one option is to decrease the
default value of big_cpu_lim from 128 to some appropriate smaller
value.  Maybe 16 or 32?

Alternatively, the code in srcu_gp_start() that this patch removes
might be executed only when ssp->srcu_size_state is equal to
SRCU_SIZE_BIG.  But this approach would likely still have performance
issues due to the acquisition of the srcu_struct structure's ->lock,
right?

Also, if you are seeing real slowdowns, please place that information
in the commit log.

Thoughts?

> -2. From the viewpoint of the callback, at the entrance,
> srcu_gp_start_if_needed() has called that pair operations and attached
> it with gp_seq. At the exit, srcu_invoke_callbacks() calls that pair
> again to extract the done callbacks. So the harvesting of the callback
> is not affected by the call to that pair in srcu_gp_start().
>
> But because the updating of RCU_DONE_TAIL by srcu_invoke_callbacks() may
> have some delay than by srcu_gp_end()->srcu_gp_start(), the removal may
> cause srcu_might_be_idle() not to be real time. To compensate that,
> supplement that pair just before the calling to rcu_segcblist_pend_cbs()
> in srcu_might_be_idle().

I agree that srcu_might_be_idle() is not a bad place to add such a
check, especially given that it has other reasons to acquire the
srcu_data structure's ->lock.

Except that this misses both call_srcu() and
synchronize_srcu_expedited(), so that if a workload invokes call_srcu()
and/or synchronize_srcu_expedited() on a given srcu_struct structure,
but never synchronize_srcu(), the opportunity to advance that CPU's
callbacks at the beginning of a grace period will always be lost.

Alternatively, if we have checks in both synchronize_srcu() and
srcu_gp_start_if_needed(), we end up duplicating the work in the case
where synchronize_srcu() is invoked.  This is also not particularly
good.

Again, thoughts?

							Thanx, Paul

> Test info:
> torture test passed using the following command against commit
> 094226ad94f4 ("Linux 6.1-rc5")
> tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 10h --configs 18*SRCU-P
>
> Signed-off-by: Pingfan Liu <kernelfans@xxxxxxxxx>
> Cc: Lai Jiangshan <jiangshanlai@xxxxxxxxx>
> Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
> Cc: Frederic Weisbecker <frederic@xxxxxxxxxx>
> Cc: Josh Triplett <josh@xxxxxxxxxxxxxxxx>
> Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
> Cc: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
> To: rcu@xxxxxxxxxxxxxxx
> ---
>  kernel/rcu/srcutree.c | 15 ++++-----------
>  1 file changed, 4 insertions(+), 11 deletions(-)
>
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index 725c82bb0a6a..36ba18967133 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -659,21 +659,10 @@ EXPORT_SYMBOL_GPL(__srcu_read_unlock);
>   */
>  static void srcu_gp_start(struct srcu_struct *ssp)
>  {
> -	struct srcu_data *sdp;
>  	int state;
>
> -	if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
> -		sdp = per_cpu_ptr(ssp->sda, 0);
> -	else
> -		sdp = this_cpu_ptr(ssp->sda);
>  	lockdep_assert_held(&ACCESS_PRIVATE(ssp, lock));
>  	WARN_ON_ONCE(ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed));
> -	spin_lock_rcu_node(sdp);  /* Interrupts already disabled. */
> -	rcu_segcblist_advance(&sdp->srcu_cblist,
> -			      rcu_seq_current(&ssp->srcu_gp_seq));
> -	(void)rcu_segcblist_accelerate(&sdp->srcu_cblist,
> -				       rcu_seq_snap(&ssp->srcu_gp_seq));
> -	spin_unlock_rcu_node(sdp);  /* Interrupts remain disabled. */
>  	WRITE_ONCE(ssp->srcu_gp_start, jiffies);
>  	WRITE_ONCE(ssp->srcu_n_exp_nodelay, 0);
>  	smp_mb(); /* Order prior store to ->srcu_gp_seq_needed vs. GP start. */
> @@ -1037,6 +1026,10 @@ static bool srcu_might_be_idle(struct srcu_struct *ssp)
>  	/* If the local srcu_data structure has callbacks, not idle. */
>  	sdp = raw_cpu_ptr(ssp->sda);
>  	spin_lock_irqsave_rcu_node(sdp, flags);
> +	rcu_segcblist_advance(&sdp->srcu_cblist,
> +			      rcu_seq_current(&ssp->srcu_gp_seq));
> +	(void)rcu_segcblist_accelerate(&sdp->srcu_cblist,
> +				       rcu_seq_snap(&ssp->srcu_gp_seq));
>  	if (rcu_segcblist_pend_cbs(&sdp->srcu_cblist)) {
>  		spin_unlock_irqrestore_rcu_node(sdp, flags);
>  		return false; /* Callbacks already present, so not idle. */
> --
> 2.31.1
>