Re: [PATCH v5 04/18] rcu: Fix late wakeup when flush of bypass cblist happens

Frederic Weisbecker <frederic@xxxxxxxxxx> · Fri, 2 Sep 2022 13:35:00 +0200

On Thu, Sep 01, 2022 at 10:17:06PM +0000, Joel Fernandes (Google) wrote:
> When the bypass cblist gets too big or its timeout has occurred, it is
> flushed into the main cblist. However, the bypass timer is still running
> and the behavior is that it would eventually expire and wake the GP
> thread.
> 
> Since we are going to use the bypass cblist for lazy CBs, do the wakeup
> as soon as the flush happens. Otherwise, the lazy-timer will go off much
> later and the now-non-lazy cblist CBs can get stranded for the duration
> of the timer.
> 
> This is a good thing to do anyway (regardless of this series), since it
> makes the behavior consistent with that of other code paths, where queueing
> something into the ->cblist quickly puts the GP kthread into a non-sleeping
> state.
> 
> Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
> ---
>  kernel/rcu/tree_nocb.h | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> index 0a5f0ef41484..31068dd31315 100644
> --- a/kernel/rcu/tree_nocb.h
> +++ b/kernel/rcu/tree_nocb.h
> @@ -447,7 +447,13 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
>  			rcu_advance_cbs_nowake(rdp->mynode, rdp);
>  			rdp->nocb_gp_adv_time = j;
>  		}
> -		rcu_nocb_unlock_irqrestore(rdp, flags);
> +
> +		// The flush succeeded and we moved CBs into the ->cblist.
> +		// However, the bypass timer might still be running. Wakeup the
> +		// GP thread by calling a helper with was_all_done set so that
> +		// wake up happens (needed if main CB list was empty before).
> +		__call_rcu_nocb_wake(rdp, true, flags);
> +

Ok so there are two different changes here:

1) Wake up nocb_gp because we just flushed the bypass list. Indeed, if the
   regular callback list was empty before flushing, we would rather wake up
   nocb_gp immediately than wait for the bypass timer to expire before the
   flushed callbacks get processed.

2) Wake up nocb_gp unconditionally (i.e. even if the regular queue was not
   empty before bypass flushing) so that nocb_gp_wait() is forced through
   another loop that starts by cancelling the bypass timer. (I suggest you put
   that explanation in the comment, btw, because that process may not be
   obvious to mortals.)
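
To make the distinction concrete, here is a rough userspace sketch of the two
policies. It is not kernel code: struct model_rdp and the model_*() helpers
are made-up names for illustration, and the real flush and wakeup paths
involve locking and deferral that are left out here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model, not the kernel's struct rcu_data. */
struct model_rdp {
	int cblist_len;	/* callbacks queued on the regular list */
	int bypass_len;	/* callbacks queued on the bypass list */
	int gp_wakeups;	/* number of times nocb_gp was woken */
};

/*
 * Move every bypass callback onto the regular list.
 * Returns true if the regular list was empty before the flush.
 */
static bool model_flush_bypass(struct model_rdp *rdp)
{
	bool was_alldone = (rdp->cblist_len == 0);

	assert(rdp->bypass_len >= 0);
	rdp->cblist_len += rdp->bypass_len;
	rdp->bypass_len = 0;
	return was_alldone;
}

/* Change 1): wake only if the regular list was empty before the flush. */
static void model_flush_wake_if_empty(struct model_rdp *rdp)
{
	if (model_flush_bypass(rdp))
		rdp->gp_wakeups++;
}

/* Change 2): wake regardless of the previous state of the regular list. */
static void model_flush_wake_always(struct model_rdp *rdp)
{
	model_flush_bypass(rdp);
	rdp->gp_wakeups++;
}
```

With two callbacks already on the regular list, model_flush_wake_if_empty()
leaves gp_wakeups untouched whereas model_flush_wake_always() bumps it. That
extra wakeup is the part that needs a justification.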

Change 1) looks like a good idea to me.

Change 2) has unclear motivation. It forces nocb_gp_wait() through another
costly loop even though the timer might have been cancelled in the near
future anyway, which would have avoided that extra loop. It also abuses
was_alldone, and we may get rcu_nocb_wake traces with incoherent reasons
(WakeEmpty/WakeEmptyIsDeferred) when the list is actually not empty.
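
As an illustration of the tracing concern, here is a simplified userspace
sketch of how the reason string depends on was_alldone. The model_*() helpers
are hypothetical, and the overflow and other paths of the real
__call_rcu_nocb_wake() are deliberately omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical, simplified model: pick the trace reason from was_alldone
 * and from whether the wakeup has to be deferred because IRQs are disabled.
 */
static const char *model_wake_reason(bool was_alldone, bool irqs_disabled)
{
	if (!was_alldone)
		return "WakeNot";
	return irqs_disabled ? "WakeEmptyIsDeferred" : "WakeEmpty";
}

/*
 * A reason starting with "WakeEmpty" is only coherent if the regular list
 * really was empty before the callbacks were queued.
 */
static bool model_reason_is_coherent(const char *reason, int cblist_len_before)
{
	bool says_empty;

	assert(reason != NULL);
	says_empty = strncmp(reason, "WakeEmpty", strlen("WakeEmpty")) == 0;
	return !says_empty || cblist_len_before == 0;
}
```

Hardcoding was_alldone to true while three callbacks were already queued
yields "WakeEmpty" in this model, which model_reason_is_coherent() rejects.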

So you may need to clarify the purpose. I would also suggest splitting this
into two patches.

Thanks!

>  		return true; // Callback already enqueued.
>  	}
>  
> -- 
> 2.37.2.789.g6183377224-goog
>