Re: [PATCH] NFSD/sunrpc: avoid deadlock on TCP connection due to memory pressure.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jul 25, 2013 at 11:30:23AM +1000, NeilBrown wrote:
> 
> Since we enabled auto-tuning for sunrpc TCP connections we do not
> guarantee that there is enough write-space on each connection to
> queue a reply.
> 
> If memory pressure causes the window to shrink too small, the request
> throttling in sunrpc/svc will not accept any requests so no more requests
> will be handled.  Even when pressure decreases the window will not
> grow again until data is sent on the connection.
> This means we get a deadlock:  no requests will be handled until there
> is more space, and no space will be allocated until a request is
> handled.
> 
> This can be simulated by modifying svc_tcp_has_wspace to inflate the
> number of byte required and removing the 'svc_sock_setbufsize' calls
> in svc_setup_socket.

Ah-hah!

> I found that multiplying by 16 was enough to make the requirement
> exceed the default allocation.  With this modification in place:
>    mount -o vers=3,proto=tcp 127.0.0.1:/home /mnt
> would block and eventually time out because the nfs server could not
> accept any requests.

So, this?:

diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 305374d..36de50d 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -1193,6 +1193,7 @@ static int svc_tcp_has_wspace(struct svc_xprt *xprt)
 	if (test_bit(XPT_LISTENER, &xprt->xpt_flags))
 		return 1;
 	required = atomic_read(&xprt->xpt_reserved) + serv->sv_max_mesg;
+	required *= 16;
 	if (sk_stream_wspace(svsk->sk_sk) >= required)
 		return 1;
 	set_bit(SOCK_NOSPACE, &svsk->sk_sock->flags);
@@ -1378,14 +1379,8 @@ static struct svc_sock *svc_setup_socket(struct svc_serv *serv,
 	/* Initialize the socket */
 	if (sock->type == SOCK_DGRAM)
 		svc_udp_init(svsk, serv);
-	else {
-		/* initialise setting must have enough space to
-		 * receive and respond to one request.
-		 */
-		svc_sock_setbufsize(svsk->sk_sock, 4 * serv->sv_max_mesg,
-					4 * serv->sv_max_mesg);
+	else
 		svc_tcp_init(svsk, serv);
-	}
 
 	dprintk("svc: svc_setup_socket created %p (inet %p)\n",
 				svsk, svsk->sk_sk);

> This patch relaxes the request throttling to always allow at least one
> request through per connection.  It does this by checking both
>   sk_stream_min_wspace() and xprt->xpt_reserved
> are zero.
> The first is zero when the TCP transmit queue is empty.
> The second is zero when there are no RPC requests being processed.
> When both of these are zero the socket is idle and so one more
> request can safely be allowed through.
> 
> Applying this patch allows the above mount command to succeed cleanly.
> Tracing shows that the allocated write buffer space quickly grows and
> after a few requests are handled, the extra tests are no longer needed
> to permit further requests to be processed.
> 
> The main purpose of request throttling is to handle the case when one
> client is slow at collecting replies and the send queue gets full of
> replies that the client hasn't acknowledged (at the TCP level) yet.
> As we only change behaviour when the send queue is empty this main
> purpose is still preserved.
> 
> Reported-by: Ben Myers <bpm@xxxxxxx>
> Signed-off-by: NeilBrown <neilb@xxxxxxx>
> 
> --
> As you can see I've changed the patch.  While writing up the above 
> description realised there was a weakness and so added the sk_stream_min_wspace
> test.  That allowed me to write the final paragraph.

This is great, thanks!

Inclined to queue it up for 3.11 and stable....

--b.

> 
> NeilBrown
> 
> 
> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
> index 305374d..7762b9f 100644
> --- a/net/sunrpc/svcsock.c
> +++ b/net/sunrpc/svcsock.c
> @@ -1193,7 +1193,9 @@ static int svc_tcp_has_wspace(struct svc_xprt *xprt)
>  	if (test_bit(XPT_LISTENER, &xprt->xpt_flags))
>  		return 1;
>  	required = atomic_read(&xprt->xpt_reserved) + serv->sv_max_mesg;
> -	if (sk_stream_wspace(svsk->sk_sk) >= required)
> +	if (sk_stream_wspace(svsk->sk_sk) >= required ||
> +	    (sk_stream_min_wspace(svsk->sk_sk) == 0 &&
> +	     atomic_read(&xprt->xpt_reserved) == 0))
>  		return 1;
>  	set_bit(SOCK_NOSPACE, &svsk->sk_sock->flags);
>  	return 0;


--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux