On Thu, Jun 30, 2022 at 12:01:14AM +0200, Frederic Weisbecker wrote:
> On Wed, Jun 29, 2022 at 08:29:48PM +0000, Joel Fernandes wrote:
> > On Wed, Jun 29, 2022 at 01:53:49PM +0200, Frederic Weisbecker wrote:
> > > On Wed, Jun 22, 2022 at 10:50:55PM +0000, Joel Fernandes (Google) wrote:
> > > > @@ -414,30 +427,37 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
> > > >  	}
> > > >  	WRITE_ONCE(rdp->nocb_nobypass_count, c);
> > > >
> > > > -	// If there hasn't yet been all that many ->cblist enqueues
> > > > -	// this jiffy, tell the caller to enqueue onto ->cblist.  But flush
> > > > -	// ->nocb_bypass first.
> > > > -	if (rdp->nocb_nobypass_count < nocb_nobypass_lim_per_jiffy) {
> > > > +	// If caller passed a non-lazy CB and there hasn't yet been all that
> > > > +	// many ->cblist enqueues this jiffy, tell the caller to enqueue it
> > > > +	// onto ->cblist.  But flush ->nocb_bypass first. Also do so, if total
> > > > +	// number of CBs (lazy + non-lazy) grows too much.
> > > > +	//
> > > > +	// Note that if the bypass list has lazy CBs, and the main list is
> > > > +	// empty, and rhp happens to be non-lazy, then we end up flushing all
> > > > +	// the lazy CBs to the main list as well. That's the right thing to do,
> > > > +	// since we are kick-starting RCU GP processing anyway for the non-lazy
> > > > +	// one, we can just reuse that GP for the already queued-up lazy ones.
> > > > +	if ((rdp->nocb_nobypass_count < nocb_nobypass_lim_per_jiffy && !lazy) ||
> > > > +	    (lazy && n_lazy_cbs >= qhimark)) {
> > > >  		rcu_nocb_lock(rdp);
> > > >  		*was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
> > > >  		if (*was_alldone)
> > > >  			trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
> > > > -					    TPS("FirstQ"));
> > > > -		WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j));
> > > > +					    lazy ? TPS("FirstLazyQ") : TPS("FirstQ"));
> > > > +		WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j, false));
> > >
> > > That's outside the scope of this patchset but this makes me realize we
> > > unconditionally try to flush the bypass from call_rcu() fastpath, and
> > > therefore we unconditionally lock the bypass lock from call_rcu() fastpath.
> > >
> > > It shouldn't be contended at this stage since we are holding the nocb_lock
> > > already, and only the local CPU can hold the nocb_bypass_lock without holding
> > > the nocb_lock. But still...
> > >
> > > It looks safe to locklessly early check if (rcu_cblist_n_cbs(&rdp->nocb_bypass))
> > > before doing anything. Only the local CPU can enqueue to the bypass list.
> > >
> > > Adding that to my TODO list...
> > >
> >
> > I am afraid I did not understand your comment. The bypass list lock is held
> > once we have decided to use the bypass list to queue something on to it.
> >
> > The bypass flushing is also conditional on either the bypass cblist growing
> > too big or a jiffy elapsing since the first bypass queue.
> >
> > So in both cases, acquiring the lock is conditional. What do you mean it is
> > unconditionally acquiring the bypass lock? Where?
>
> Just to make sure we are talking about the same thing, I'm referring to this
> path:
>
> 	// If there hasn't yet been all that many ->cblist enqueues
> 	// this jiffy, tell the caller to enqueue onto ->cblist.  But flush
> 	// ->nocb_bypass first.
> 	if (rdp->nocb_nobypass_count < nocb_nobypass_lim_per_jiffy) {
> 		rcu_nocb_lock(rdp);
> 		*was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
> 		if (*was_alldone)
> 			trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
> 					    TPS("FirstQ"));
> 		WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j));
> 		WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
> 		return false; // Caller must enqueue the callback.
> 	}
>
> This is called whenever we decide not to queue to the bypass list because
> there is no flooding detected (rdp->nocb_nobypass_count hasn't reached
> nocb_nobypass_lim_per_jiffy for the current jiffy). I call this the fast path
> because this is what I would expect in a normal load, as opposed to callbacks
> flooding.
>
> And in this fastpath, the above rcu_nocb_flush_bypass() is unconditional.

Sorry, you are right, I see that now.

Another reason why the contention is probably not a big deal (other than
the nocb lock being held) is that all other callers of the flush appear to be
in slow paths except for this one. Unless someone is offloading/deoffloading
rapidly or something :)

thanks,

 - Joel
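
For illustration only, the lockless early check Frederic describes above could
be modeled in user space roughly as below. This is a sketch under stated
assumptions, not kernel code and not a proposed patch: struct model_rdp,
model_flush_bypass() and model_fastpath_flush() are hypothetical names, the
"lock" is just a counter so the number of acquisitions can be observed, and the
model is single-threaded, so it ignores memory ordering and the nocb_lock
entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in; NOT the kernel's struct rcu_data. */
struct model_rdp {
	long bypass_len;        /* models rcu_cblist_n_cbs(&rdp->nocb_bypass) */
	long cblist_len;        /* models the number of CBs queued on ->cblist */
	long bypass_lock_count; /* how often the modeled bypass lock was taken */
};

/* Move every callback from the bypass list to the main list. */
static void model_flush_bypass(struct model_rdp *rdp)
{
	rdp->bypass_lock_count++;           /* models acquiring the bypass lock */
	rdp->cblist_len += rdp->bypass_len; /* splice bypass CBs onto ->cblist */
	rdp->bypass_len = 0;
	/* the modeled bypass lock would be released here */
}

/*
 * Fast-path flush with an early lockless check: if the bypass list is
 * already empty, skip the lock and the flush altogether.
 * Returns true if a flush was actually performed.
 */
static bool model_fastpath_flush(struct model_rdp *rdp)
{
	/* Assumption from the thread: only the local CPU enqueues to bypass. */
	if (rdp->bypass_len == 0)
		return false;

	model_flush_bypass(rdp);
	assert(rdp->bypass_len == 0); /* mirrors the WARN_ON_ONCE() in the path */
	return true;
}
```

With an empty bypass list, model_fastpath_flush() returns false and
bypass_lock_count stays untouched; with a non-empty one, the lock is taken
exactly once and the callbacks end up on the main list.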