On Thu, Aug 04 2016, Trond Myklebust wrote: >> On Aug 3, 2016, at 19:33, NeilBrown <neilb@xxxxxxxx> wrote: >> >> >> This reverts commit 4b0ab51db32eba0f48b7618254742f143364a28d. >> >> This change causes 'rpc.nfsd' to hang for long time if rpcbind is not >> available. >> If >> --no-nfs-version 2 --no-nfs-version 3 >> is given, the delay is about 6.5 minutes. When trying to register >> all versions, the delay is over half an hour. >> Before this commit, and after reverting it, nfsd fails (when v3 is >> requested) or succeeds (when only v4 is requested) immediately. >> >> Signed-off-by: NeilBrown <neilb@xxxxxxxx> >> --- >> net/sunrpc/xprtsock.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c >> index 111767ab124a..2a938055e95b 100644 >> --- a/net/sunrpc/xprtsock.c >> +++ b/net/sunrpc/xprtsock.c >> @@ -795,6 +795,7 @@ static void xs_sock_mark_closed(struct rpc_xprt *xprt) >> xs_sock_reset_connection_flags(xprt); >> /* Mark transport as closed and wake up all pending tasks */ >> xprt_disconnect_done(xprt); >> + xprt_force_disconnect(xprt); >> } >> > > If there is an outstanding request, then _that_ is supposed to redrive the connection. Why is the xprt_disconnect_done() not functioning as per the comment above it? I made some time to look at this and have a somewhat clearer understanding. The rpcbind code makes about 24 requests, include NULL pings, UNSET requests to clear any old registrations, and SET requests. The first (a NULL) fails immediately due to ECONNREFUSED. Others aren't even attempted for a growing number of seconds because xs_connect() finds that transport->sock is not NULL, so it delays before trying. Once it tries, it failed immediately and we move on to the next request. transport->sock is set to NULL by xs_reset_transport() which xs_connect() calls as part of the the timeout, but ->reestablish_timeout is not reset. The next time xs_connect is called, ->sock will be non-NULL again as transport->reestablish_timeout will be longer. transport->sock is only set to NULL *before* the test in xs_connect() by a call to xs_close() or ->close(). But nothing in triggering this. xs_sock_mark_closed() currently doesn't actually mark the socket as closed (XPRT_CLOSE_WAITING), or close it directly. static void xs_sock_mark_closed(struct rpc_xprt *xprt) { xs_sock_reset_connection_flags(xprt); /* Mark transport as closed and wake up all pending tasks */ xprt_disconnect_done(xprt); } xs_sock_reset_connection_flags() clears XPRT_CLOSE_WAITING, XPRT_CLOSING, and XPRT_SOCK_DATA_READY, and adds some memory barrier. xprt_disconnect_done() clears XPRT_CONNECTED and wakes up the task. So none of this will "Mark transport as closed" like the comment suggests. They just mark it as disconnected. xs_sock_reset_connection_flags() even *clears* XPRT_CLOSE_WAITING, which we really want to be set. The commit said: Under all conditions, it should be quite sufficient just to mark the socket as disconnected. It will then be closed by the transport shutdown or reconnect code. The "reconnect code" is presumably xprt_connect()? This will close the socket if XPRT_CLOSE_WAIT is set., but not if the socket is marked as disconnected. When marked as disconnected, it tries to reconnect - which incurs a timeout. The "transport shutdown" code is presumably xprt_destroy()? Yes, that would close the socket, but that doesn't straight away. The rpcbind client can try 10 UNSET calls before destroying the transport. It is possible that __svc_unregister should return an error so that svc_unregister() can abort the loop on error. That would reduce the timeouts to some extent, but not completely as we would still get a timeout when trying to register. It might make sense that if the svc_unregister() in svc_rpcb_setup() fails due to ECONNREFUSED, then it should fail so we never even try to register. Plumbing that status through nfsd so that it doesn't abort in the V4-only case when rpcbind isn't available could be done, but seems like a lot of work for little gain. Also, the error that comes through to __svc_unregister is "-EAGAIN" rather then -ECONNREFUSED. I think that the easiest solution is to make sure reconnection attempts on an RPC_IS_SOFTCONN tasks do not trigger a delay in xs_connect(). call_connect_status() already adds a 3second delay for other tasks, so simply forcing a close in xs_sock_marked_close() should be sufficient. The revert I posted would do that, or we could just add set_bit(XPRT_CLOSE_WAIT, &xprt->state); so that the comment would be correct. That is sufficient. Another approach which works is: diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index 7f79fb7dc6a0..f80b135b4830 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -1936,8 +1936,10 @@ call_connect_status(struct rpc_task *task) case -EADDRINUSE: case -ENOBUFS: case -EPIPE: - if (RPC_IS_SOFTCONN(task)) + if (RPC_IS_SOFTCONN(task)) { + xprt_force_disconnect(task->tk_rqstp->rq_xprt); break; + } /* retry with existing socket, after a delay */ rpc_delay(task, 3*HZ); case -EAGAIN: so that we force a disconnect precisely when a connection attempt fails on a SOFTCONN task. Which of these solutions do you prefer? Thanks, NeilBrown
Attachment:
signature.asc
Description: PGP signature