On Thu, Aug 04 2016, Trond Myklebust wrote: >> On Aug 3, 2016, at 19:33, NeilBrown <neilb@xxxxxxxx> wrote: >> >> >> This reverts commit 4b0ab51db32eba0f48b7618254742f143364a28d. >> >> This change causes 'rpc.nfsd' to hang for long time if rpcbind is not >> available. >> If >> --no-nfs-version 2 --no-nfs-version 3 >> is given, the delay is about 6.5 minutes. When trying to register >> all versions, the delay is over half an hour. >> Before this commit, and after reverting it, nfsd fails (when v3 is >> requested) or succeeds (when only v4 is requested) immediately. >> >> Signed-off-by: NeilBrown <neilb@xxxxxxxx> >> --- >> net/sunrpc/xprtsock.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c >> index 111767ab124a..2a938055e95b 100644 >> --- a/net/sunrpc/xprtsock.c >> +++ b/net/sunrpc/xprtsock.c >> @@ -795,6 +795,7 @@ static void xs_sock_mark_closed(struct rpc_xprt *xprt) >> xs_sock_reset_connection_flags(xprt); >> /* Mark transport as closed and wake up all pending tasks */ >> xprt_disconnect_done(xprt); >> + xprt_force_disconnect(xprt); >> } >> > > If there is an outstanding request, then _that_ is supposed to redrive the connection. Why is the xprt_disconnect_done() not functioning as per the comment above it? No idea, sorry. I haven't tried to follow though exactly what is happening. The symptom is trivial to reproduce: rpc.nfsd 0 killall rpcbind rpc.nfsd 8 This will either complete immediately with kernel log messages, or block for a long time. I might try to figure out what is happening, but I suspect there are others who know this code a lot better than I do. NeilBrown
Attachment:
signature.asc
Description: PGP signature