Re: [PATCH v1 1/2] Revert "SUNRPC: Reduce thread wake-up rate when receiving large RPC messages"

On Sat, 04 Jan 2025, Jeff Layton wrote:
> On Thu, 2025-01-02 at 20:00 -0500, cel@xxxxxxxxxx wrote:
> > From: Chuck Lever <chuck.lever@xxxxxxxxxx>
> > 
> > I noticed that a handful of NFSv3 fstests were taking an
> > unexpectedly long time to run. Troubleshooting showed that the
> > server's TCP window closed and never re-opened, which caused the
> > client to trigger an RPC retransmit timeout after 180 seconds.
> > 
> > The client's recovery action was to establish a fresh connection
> > and retransmit the timed-out requests. This worked, but it adds a
> > long delay.
> > 
> > I tracked the problem to the commit that attempted to reduce the
> > rate at which the network layer delivers TCP socket data_ready
> > callbacks. Under most circumstances this change worked as expected,
> > but for NFSv3, which has no session or other type of throttling, it
> > can overwhelm the receiver on occasion.
> > 
> > I'm sure I could tweak the lowat settings, but the small benefit
> > doesn't seem worth the bother. Just revert it.
> > 
> > Fixes: 2b877fc53e97 ("SUNRPC: Reduce thread wake-up rate when receiving large RPC messages")
> > Cc: Jakub Kicinski <kuba@xxxxxxxxxx>
> > Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
> > ---
> >  net/sunrpc/svcsock.c | 12 +-----------
> >  1 file changed, 1 insertion(+), 11 deletions(-)
> > 
> > diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
> > index 95397677673b..cb3bd12f5818 100644
> > --- a/net/sunrpc/svcsock.c
> > +++ b/net/sunrpc/svcsock.c
> > @@ -1083,9 +1083,6 @@ static void svc_tcp_fragment_received(struct svc_sock *svsk)
> >  	/* If we have more data, signal svc_xprt_enqueue() to try again */
> >  	svsk->sk_tcplen = 0;
> >  	svsk->sk_marker = xdr_zero;
> > -
> > -	smp_wmb();
> > -	tcp_set_rcvlowat(svsk->sk_sk, 1);
> >  }
> >  
> >  /**
> > @@ -1175,17 +1172,10 @@ static int svc_tcp_recvfrom(struct svc_rqst *rqstp)
> >  		goto err_delete;
> >  	if (len == want)
> >  		svc_tcp_fragment_received(svsk);
> > -	else {
> > -		/* Avoid more ->sk_data_ready() calls until the rest
> > -		 * of the message has arrived. This reduces service
> > -		 * thread wake-ups on large incoming messages. */
> > -		tcp_set_rcvlowat(svsk->sk_sk,
> > -				 svc_sock_reclen(svsk) - svsk->sk_tcplen);
> 
> Could we instead clamp this to the window size so that we at least only
> do a wakeup for each window-size chunk?

tcp_set_rcvlowat() already imposes an upper limit based on the configured
queue space, so I don't think we need to clamp.

I note that the calculation is wrong. sk_tcplen includes the size of the
4-byte length header, but svc_sock_reclen() does not.  In other places
where these two numbers are compared, sizeof(rpc_fraghdr) is subtracted
from sk_tcplen.
So there is a corner case where the size passed to tcp_set_rcvlowat() is
negative.  The two numbers in the subtraction are u32 and the function
expects an "int".
I didn't dig enough to understand how a negative value would be handled.

> 
> > -
> > +	else
> >  		trace_svcsock_tcp_recv_short(&svsk->sk_xprt,
> >  				svc_sock_reclen(svsk),
> >  				svsk->sk_tcplen - sizeof(rpc_fraghdr));
> > -	}
> >  	goto err_noclose;
> >  error:
> >  	if (len != -EAGAIN)
> 
> Reverting for now is fine though.

Agreed.  I would expect modern network cards to deliver fairly large
packets to the application already.  Still, it would be nice to
understand what is really happening.  Setting rcvlowat shouldn't be able
to cause the window size to go to zero.

NeilBrown


> 
> Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxx>
> 