Re: CPU stall, eventual host hang with BTRFS + NFS under heavy load

"NeilBrown" <neilb@xxxxxxx> · Fri, 13 Aug 2021 07:36:25 +1000

On Fri, 13 Aug 2021, J. Bruce Fields wrote:
> On Tue, Aug 10, 2021 at 10:43:31AM +1000, NeilBrown wrote:
> > 
> > The problem here appears to be that a signalled task is being retried
> > without clearing the SIGNALLED flag.  That is causing the infinite loop
> > and the soft lockup.
> > 
> > This bug appears to have been introduced in Linux 5.2 by
> > Commit: ae67bd3821bb ("SUNRPC: Fix up task signalling")
> 
> I wonder how we arrived here.  Does it require that an rpc task returns
> from one of those rpc_delay() calls just as rpc_shutdown_client() is
> signalling it?  That's the only way async tasks get signalled, I think.

I don't think "just as" is needed.
I think it could only happen if rpc_shutdown_client() were called when
there were active tasks - presumably from nfsd4_process_cb_update(), but
I don't know the callback code well.
If any of those active tasks has a ->done handler which might try to
reschedule the task when tk_status == -ERESTARTSYS, then you get into
the infinite loop.
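
To make the loop concrete, here is a toy user-space model, not the real
SUNRPC state machine. Every name in it (toy_task, toy_done, toy_run) is
made up for illustration, and the retry-on-any-error ->done handler is an
assumption about what a callback might do. It only shows why a flag that
stays set turns "retry on error" into a loop that never terminates.

```c
#include <stdbool.h>

#define TOY_ERESTARTSYS 512	/* same value as the kernel-internal ERESTARTSYS */

/* Hypothetical stand-in for struct rpc_task. */
struct toy_task {
	bool signalled;		/* plays the role of RPC_TASK_SIGNALLED */
	int  status;		/* plays the role of tk_status */
};

/* Hypothetical ->done handler: asks for a restart on any error. */
static bool toy_done(const struct toy_task *t)
{
	return t->status < 0;
}

/*
 * Run the task until ->done is satisfied or max_passes is reached.
 * Returns the number of passes made.  The cap exists only so the
 * model terminates; the real loop has no such cap.
 */
static int toy_run(struct toy_task *t, bool clear_on_restart, int max_passes)
{
	int passes = 0;

	while (passes < max_passes) {
		passes++;
		/* A signalled task exits immediately with -ERESTARTSYS. */
		t->status = t->signalled ? -TOY_ERESTARTSYS : 0;
		if (!toy_done(t))
			break;
		/* Restart path: the flag survives unless explicitly cleared. */
		if (clear_on_restart)
			t->signalled = false;
	}
	return passes;
}
```

With clear_on_restart false the model spins until it hits the cap; with
it true the second pass succeeds and the loop ends.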

> 
> > Prior to this commit a flag RPC_TASK_KILLED was used, and it gets
> > cleared by rpc_reset_task_statistics() (called from rpc_exit_task()).
> > After this commit a new flag RPC_TASK_SIGNALLED is used, and it is never
> > cleared.
> > 
> > A fix might be to clear RPC_TASK_SIGNALLED in
> > rpc_reset_task_statistics(), but I'll leave that decision to someone
> > else.
> 
> Might be worth testing with that change just to verify that this is
> what's happening.
> 
> diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
> index c045f63d11fa..caa931888747 100644
> --- a/net/sunrpc/sched.c
> +++ b/net/sunrpc/sched.c
> @@ -813,7 +813,8 @@ static void
>  rpc_reset_task_statistics(struct rpc_task *task)
>  {
>  	task->tk_timeouts = 0;
> -	task->tk_flags &= ~(RPC_CALL_MAJORSEEN|RPC_TASK_SENT);
> +	task->tk_flags &= ~(RPC_CALL_MAJORSEEN|RPC_TASK_SIGNALLED|
> +							RPC_TASK_SENT);

NONONONONO.
RPC_TASK_SIGNALLED is a flag in tk_runstate.
So you need
	clear_bit(RPC_TASK_SIGNALLED, &task->tk_runstate);
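
The difference matters because one field takes bit masks and the other
takes bit numbers. A small user-space sketch of that mistake follows.
The TOY_* names and their values are hypothetical, chosen only to show
the effect, and toy_clear_bit() is a plain non-atomic stand-in for the
kernel's atomic clear_bit().

```c
/* Mask-style flags, as used with a tk_flags-like word (illustrative values). */
#define TOY_FLAG_MAJORSEEN	0x0020UL
#define TOY_FLAG_SENT		0x0800UL

/* Bit-number-style flag, as used with a tk_runstate-like word (illustrative). */
#define TOY_RUN_SIGNALLED	6

/* Hypothetical stand-in for the two words in struct rpc_task. */
struct toy_task {
	unsigned long flags;
	unsigned long runstate;
};

/* Non-atomic stand-in for clear_bit(nr, addr). */
static void toy_clear_bit(int nr, unsigned long *addr)
{
	*addr &= ~(1UL << nr);
}

/* Wrong: ORs a bit number into a mask and applies it to the wrong word. */
static void toy_reset_wrong(struct toy_task *t)
{
	t->flags &= ~(TOY_FLAG_MAJORSEEN | TOY_RUN_SIGNALLED | TOY_FLAG_SENT);
}

/* Right: masks for the flags word, a bit number for the runstate word. */
static void toy_reset_right(struct toy_task *t)
{
	t->flags &= ~(TOY_FLAG_MAJORSEEN | TOY_FLAG_SENT);
	toy_clear_bit(TOY_RUN_SIGNALLED, &t->runstate);
}
```

In the wrong version the value 6 is treated as the mask 0x6, so it
clears two unrelated bits in the flags word and leaves the signalled
bit in the runstate word set.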

NeilBrown