On Tue, 2012-10-23 at 16:08 -0400, Chris Perl wrote: > On Tue, Oct 23, 2012 at 04:02:27PM +0000, Myklebust, Trond wrote: > > call_transmit() should normally just cause the rpc_task to detect that > > the connection is down, and should cause it to go to call_connect in > > order to fix everything. > > > > If it is causing a race, then I'm thinking that is probably due to the > > call to xs_tcp_shutdown() when we get an EPIPE error in > > xs_tcp_send_request(). That shutdown seems to be needed only in the case > > when we're in TCP_CLOSE_WAIT, in which case what we really want is to > > force the rpc_task to go to xprt_connect, which means clearing > > xprt_connected(). > > > > Hmm.... Can you try applying the attached 2 patches? > > I rebuilt with those two patches as well as the first one that removed > the `xs_error_report' callback. > > Good news, it works! > > However, looking at the interactions, it still seems a little bit racy. > Clearly you know more about this than I do, so I'll defer to your better > judgement, but let me lay out what I believe is happening: > > Stat Thread rpciod > ------------- -------- > - woken up with tk_status set to > EAGAIN via xs_tcp_state_change > > - tries call_transmit again, which > calls xs_tcp_send_request which > comes back with ECONNRESET (b/c this > is what sk_err is) > > - call_status sees this and goes to > sleep for 3 seconds and then sets > tk_action to call_bind > > - call_bind > > - call_connect, queues execution of > xs_tcp_setup_socket with rpciod and > goes to sleep > > - starts doing its thing, but wakes > up our other thread before its done > trying to connect the socket (or > failing to do so) via the call > stack mentioned before (via > xprt_disconnect_done) > > - tk_status is set to EAGAIN, so > call_connect_status moves on to > call_transmit > > - call_transmit calls > xs_tcp_send_request again which > finds the socket with sk_shutdown != > 0 and so returns EPIPE > > - call_status sees the EPIPE and sets > tk_action back to call_bind > > - continues on and updates the state > of the socket to TCP_SYN_SENT > > - When the connection is established, > xs_tcp_state_change takes care of > marking the xprt connected > > - call_bind > > - call_connect, however this time it > schedules no work because it finds > that the transport is already > connected > > - call_transmit, succeeds, we recover > > This is how the recovery went when I was testing, but I could imagine a > situation where different timing comes into play and we wind up asking > rpciod to execute xs_tcp_setup_socket twice (because the first connect > hasn't been established yet). I haven't thought too hard about what > that would mean, but I imagine it probably wouldn't be good. > > Like I said, you know way more about this than I do, I just wanted to point this > out. Its entirely possible that I'm missing or misunderstanding something. The only problem there would be the call to xs_sock_mark_closed() in xs_abort_connection. As far as I can tell, all we want to do there is clear all those flags. How about the following? Cheers Trond 8<------------------------------------------------------------------ >From c2d1bd4ab6c85cdbdf3380de2269817092fb9a11 Mon Sep 17 00:00:00 2001 From: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> Date: Tue, 23 Oct 2012 17:50:07 -0400 Subject: [PATCH] SUNRPC: Prevent races in xs_abort_connection() The call to xprt_disconnect_done() that is triggered by a successful connection reset will trigger another automatic wakeup of all tasks on the xprt->pending rpc_wait_queue. In particular it will cause an early wake up of the task that called xprt_connect(). All we really want to do here is clear all the socket-specific state flags, so we split that functionality out of xs_sock_mark_closed() into a helper that can be called by xs_abort_connection() Reported-by: Chris Perl <chris.perl@xxxxxxxxx> Signed-off-by: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> Cc: stable@xxxxxxxxxxxxxxx --- net/sunrpc/xprtsock.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c index 7e2dd0d..1f105c2 100644 --- a/net/sunrpc/xprtsock.c +++ b/net/sunrpc/xprtsock.c @@ -1453,7 +1453,7 @@ static void xs_tcp_cancel_linger_timeout(struct rpc_xprt *xprt) xprt_clear_connecting(xprt); } -static void xs_sock_mark_closed(struct rpc_xprt *xprt) +static void xs_sock_reset_connection_flags(struct rpc_xprt *xprt) { smp_mb__before_clear_bit(); clear_bit(XPRT_CONNECTION_ABORT, &xprt->state); @@ -1461,6 +1461,11 @@ static void xs_sock_mark_closed(struct rpc_xprt *xprt) clear_bit(XPRT_CLOSE_WAIT, &xprt->state); clear_bit(XPRT_CLOSING, &xprt->state); smp_mb__after_clear_bit(); +} + +static void xs_sock_mark_closed(struct rpc_xprt *xprt) +{ + xs_sock_reset_connection_flags(xprt); /* Mark transport as closed and wake up all pending tasks */ xprt_disconnect_done(xprt); } @@ -2051,10 +2056,8 @@ static void xs_abort_connection(struct sock_xprt *transport) any.sa_family = AF_UNSPEC; result = kernel_connect(transport->sock, &any, sizeof(any), 0); if (!result) - xs_sock_mark_closed(&transport->xprt); - else - dprintk("RPC: AF_UNSPEC connect return code %d\n", - result); + xs_sock_reset_connection_flags(&transport->xprt); + dprintk("RPC: AF_UNSPEC connect return code %d\n", result); } static void xs_tcp_reuse_connection(struct sock_xprt *transport) -- 1.7.11.7 -- Trond Myklebust Linux NFS client maintainer NetApp Trond.Myklebust@xxxxxxxxxx www.netapp.com ��.n��������+%������w��{.n�����{��w���jg��������ݢj����G�������j:+v���w�m������w�������h�����٥