Re: [PATCH] Revert "SUNRPC: xs_sock_mark_closed() does not need to trigger socket autoclose"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> On Aug 9, 2016, at 22:05, NeilBrown <neilb@xxxxxxxx> wrote:
> 
> On Thu, Aug 04 2016, Trond Myklebust wrote:
> 
>>> On Aug 3, 2016, at 19:33, NeilBrown <neilb@xxxxxxxx> wrote:
>>> 
>>> 
>>> This reverts commit 4b0ab51db32eba0f48b7618254742f143364a28d.
>>> 
>>> This change causes 'rpc.nfsd' to hang for long time if rpcbind is not
>>> available.
>>> If
>>>  --no-nfs-version 2 --no-nfs-version 3
>>> is given, the delay is about 6.5 minutes.  When trying to register
>>> all versions, the delay is over half an hour.
>>> Before this commit, and after reverting it, nfsd fails (when v3 is
>>> requested) or succeeds (when only v4 is requested) immediately.
>>> 
>>> Signed-off-by: NeilBrown <neilb@xxxxxxxx>
>>> ---
>>> net/sunrpc/xprtsock.c | 1 +
>>> 1 file changed, 1 insertion(+)
>>> 
>>> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
>>> index 111767ab124a..2a938055e95b 100644
>>> --- a/net/sunrpc/xprtsock.c
>>> +++ b/net/sunrpc/xprtsock.c
>>> @@ -795,6 +795,7 @@ static void xs_sock_mark_closed(struct rpc_xprt *xprt)
>>> 	xs_sock_reset_connection_flags(xprt);
>>> 	/* Mark transport as closed and wake up all pending tasks */
>>> 	xprt_disconnect_done(xprt);
>>> +	xprt_force_disconnect(xprt);
>>> }
>>> 
>> 
>> If there is an outstanding request, then _that_ is supposed to redrive the connection. Why is the xprt_disconnect_done() not functioning as per the comment above it?
> 
> I made some time to look at this and have a somewhat clearer
> understanding.
> 
> The rpcbind code makes about 24 requests, include NULL pings, UNSET
> requests to clear any old registrations, and SET requests.
> 
> The first (a NULL) fails immediately due to ECONNREFUSED.  Others aren't
> even attempted for a growing number of seconds because xs_connect()
> finds that transport->sock is not NULL, so it delays before trying.
> Once it tries, it failed immediately and we move on to the next request.
> 
> transport->sock is set to NULL by xs_reset_transport() which xs_connect() calls
> as part of the the timeout, but ->reestablish_timeout is not reset.
> The next time xs_connect is called, ->sock will be non-NULL again as
> transport->reestablish_timeout will be longer.
> 
> transport->sock is only set to NULL *before* the test in xs_connect() by a
> call to xs_close() or ->close().  But nothing in triggering this.
> 
> xs_sock_mark_closed() currently doesn't actually mark the socket as
> closed (XPRT_CLOSE_WAITING), or close it directly.
> 
> static void xs_sock_mark_closed(struct rpc_xprt *xprt)
> {
> 	xs_sock_reset_connection_flags(xprt);
> 	/* Mark transport as closed and wake up all pending tasks */
> 	xprt_disconnect_done(xprt);
> }
> 
> xs_sock_reset_connection_flags() clears XPRT_CLOSE_WAITING,
> XPRT_CLOSING, and XPRT_SOCK_DATA_READY, and adds some memory barrier.
> 
> xprt_disconnect_done() clears XPRT_CONNECTED and wakes up the task.
> 
> So none of this will "Mark transport as closed" like the comment
> suggests.  They just mark it as disconnected.
> xs_sock_reset_connection_flags() even *clears* XPRT_CLOSE_WAITING, which
> we really want to be set.
> 
> The commit said:
> 
>    Under all conditions, it should be quite sufficient just to mark
>    the socket as disconnected. It will then be closed by the
>    transport shutdown or reconnect code.
> 
> The "reconnect code" is presumably xprt_connect()?  This will close the
> socket if XPRT_CLOSE_WAIT is set., but not if the socket is marked as
> disconnected.  When marked as disconnected, it tries to reconnect -
> which incurs a timeout.
> 
> The "transport shutdown" code is presumably xprt_destroy()?  Yes, that
> would close the socket, but that doesn't straight away.  The rpcbind
> client can try 10 UNSET calls before destroying the transport.
> 
> 
> It is possible that __svc_unregister should return an error so that
> svc_unregister() can abort the loop on error.  That would reduce the
> timeouts to some extent, but not completely as we would still get a
> timeout when trying to register.  It might make sense that if the
> svc_unregister() in svc_rpcb_setup() fails due to ECONNREFUSED, then it
> should fail so we never even try to register.  Plumbing that status
> through nfsd so that it doesn't abort in the V4-only case when rpcbind
> isn't available could be done, but seems like a lot of work for little gain.
> 
> Also, the error that comes through to __svc_unregister is "-EAGAIN"
> rather then -ECONNREFUSED.
> 
> I think that the easiest solution is to make sure reconnection attempts
> on an RPC_IS_SOFTCONN tasks do not trigger a delay in xs_connect().
> call_connect_status() already adds a 3second delay for other tasks, so
> simply forcing a close in xs_sock_marked_close() should be sufficient.
> The revert I posted would do that, or we could just add
> 
> 	set_bit(XPRT_CLOSE_WAIT, &xprt->state);
> 
> so that the comment would be correct.  That is sufficient.
> 
> Another approach which works is:
> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
> index 7f79fb7dc6a0..f80b135b4830 100644
> --- a/net/sunrpc/clnt.c
> +++ b/net/sunrpc/clnt.c
> @@ -1936,8 +1936,10 @@ call_connect_status(struct rpc_task *task)
> 	case -EADDRINUSE:
> 	case -ENOBUFS:
> 	case -EPIPE:
> -		if (RPC_IS_SOFTCONN(task))
> +		if (RPC_IS_SOFTCONN(task)) {
> +			xprt_force_disconnect(task->tk_rqstp->rq_xprt);
> 			break;
> +		}
> 		/* retry with existing socket, after a delay */
> 		rpc_delay(task, 3*HZ);
> 	case -EAGAIN:
> 
> so that we force a disconnect precisely when a connection attempt fails
> on a SOFTCONN task.
> 
> Which of these solutions do you prefer?


Instead of the above, could we rather look at adding a call to xprt_conditional_disconnect() to ensure that we don’t keep disconnecting multiple times? It means we’d have to set req->rq_connect_cookie in xprt_connect() before sleeping on xprt->pending, but I don’t think that is a major imposition.

I also think we might not want to make that call conditional on RPC_IS_SOFTCONN.


��.n��������+%������w��{.n�����{��w���jg��������ݢj����G�������j:+v���w�m������w�������h�����٥




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux