Re: NFS TCP race condition with SOCK_ASYNC_NOSPACE

Andrew Cooper <andrew.cooper3@xxxxxxxxxx> · Mon, 21 Nov 2011 18:14:17 +0000

Following some debugging, I believe that the attached patch fixes the
problem.

Simply returning EAGAIN is not sufficient: the task is not requeued, and
it times out 13 seconds later (as per our mount options).  Setting the
SOCK_ASYNC_NOSPACE bit is what causes the requeue to happen.

I realize that this is a gross hack and I should probably not be using
SOCK_ASYNC_NOSPACE in that way.  Is there a better way to achieve the
same result?
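
To make the difference concrete, the logic before and after the patch can
be modelled in plain userspace C.  This is only a sketch: struct fake_xprt,
the FAKE_* flag values, fake_wait_for_space() and both nospace_*()
functions are hypothetical stand-ins for the kernel structures, and the
initial ret = 0 is an assumption, since the hunk does not show how ret is
initialised.

```c
/* Userspace model only; none of these names exist in the kernel. */
#include <errno.h>
#include <stdbool.h>

#define FAKE_ASYNC_NOSPACE 0x1u	/* models SOCK_ASYNC_NOSPACE */
#define FAKE_NOSPACE       0x2u	/* models SOCK_NOSPACE */

struct fake_xprt {
	bool connected;		/* models xprt_connected(xprt) */
	unsigned int flags;	/* models transport->sock->flags */
	int write_pending;	/* models transport->inet->sk_write_pending */
	bool task_waiting;	/* true once the task has been parked */
};

/* Models xprt_wait_for_buffer_space(): park the task until space frees up. */
static void fake_wait_for_space(struct fake_xprt *x)
{
	x->task_waiting = true;
}

/* Pre-patch logic: wait only if the ASYNC_NOSPACE bit is already set. */
static int nospace_before(struct fake_xprt *x)
{
	int ret = 0;	/* assumed initial value, not shown in the hunk */

	if (x->connected) {
		if (x->flags & FAKE_ASYNC_NOSPACE) {
			ret = -EAGAIN;
			x->flags |= FAKE_NOSPACE;
			x->write_pending++;
			fake_wait_for_space(x);
		}
	} else {
		x->flags &= ~FAKE_ASYNC_NOSPACE;
		ret = -ENOTCONN;
	}
	return ret;
}

/* Patched logic: set the bit unconditionally, then always wait. */
static int nospace_after(struct fake_xprt *x)
{
	int ret;

	if (x->connected) {
		x->flags |= FAKE_ASYNC_NOSPACE;
		ret = -EAGAIN;
		x->flags |= FAKE_NOSPACE;
		x->write_pending++;
		fake_wait_for_space(x);
	} else {
		x->flags &= ~FAKE_ASYNC_NOSPACE;
		ret = -ENOTCONN;
	}
	return ret;
}
```

In this model, calling nospace_before() on a connected transport whose
FAKE_ASYNC_NOSPACE bit is clear leaves task_waiting false, which
corresponds to the task never being requeued.  nospace_after() on the same
state returns -EAGAIN and parks the task.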

-- 
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com

diff -r 69bd2176baf9 net/sunrpc/xprtsock.c
--- a/net/sunrpc/xprtsock.c	Mon Nov 07 13:00:06 2011 +0000
+++ b/net/sunrpc/xprtsock.c	Mon Nov 21 18:00:14 2011 +0000
@@ -503,17 +503,16 @@ static int xs_nospace(struct rpc_task *t
 
 	/* Don't race with disconnect */
 	if (xprt_connected(xprt)) {
-		if (test_bit(SOCK_ASYNC_NOSPACE, &transport->sock->flags)) {
-			ret = -EAGAIN;
-			/*
-			 * Notify TCP that we're limited by the application
-			 * window size
-			 */
-			set_bit(SOCK_NOSPACE, &transport->sock->flags);
-			transport->inet->sk_write_pending++;
-			/* ...and wait for more buffer space */
-			xprt_wait_for_buffer_space(task, xs_nospace_callback);
-		}
+		set_bit(SOCK_ASYNC_NOSPACE, &transport->sock->flags);
+		ret = -EAGAIN;
+		/*
+		 * Notify TCP that we're limited by the application
+		 * window size
+		 */
+		set_bit(SOCK_NOSPACE, &transport->sock->flags);
+		transport->inet->sk_write_pending++;
+		/* ...and wait for more buffer space */
+		xprt_wait_for_buffer_space(task, xs_nospace_callback);
 	} else {
 		clear_bit(SOCK_ASYNC_NOSPACE, &transport->sock->flags);
 		ret = -ENOTCONN;