Re: [PATCH RFC] SUNRPC: Backchannel RPCs don't fail when the transport disconnects

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> On Apr 3, 2020, at 4:05 PM, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
> 
> Hi Trond, thanks for the look!
> 
>> On Apr 3, 2020, at 4:00 PM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
>> 
>> On Fri, 2020-04-03 at 15:42 -0400, Chuck Lever wrote:
>>> Commit 3832591e6fa5 ("SUNRPC: Handle connection issues correctly on
>>> the back channel") intended to make backchannel RPCs fail
>>> immediately when there is no forward channel connection. What's
>>> currently happening is, when the forward channel conneciton goes
>>> away, backchannel operations are causing hard loops because
>>> call_transmit_status's SOFTCONN logic ignores ENOTCONN.
>> 
>> Won't RPC_TASK_NOCONNECT do the right thing? It should cause the
>> request to exit with an ENOTCONN error when it hits call_connect().
> 
> OK, so does that mean SOFTCONN is entirely the wrong semantic here?
> 
> Was RPC_TASK_NOCONNECT available when 3832591e6fa5 was merged?

It turns out 3832591e6fa5 is not related. It's 58255a4e3ce5 that added
RPC_TASK_SOFTCONN on NFSv4 callback Calls.

However, the server uses nfsd4_run_cb_work() for both NFSv4.0 and
NFSv4.1 callbacks. IMO a fix for this will have to take care that
RPC_TASK_NOCONNECT is not set on NFSv4.0 callback tasks.

Unfortunately the looping behavior appears also related to the
RPC_IS_SIGNALLED bug I reported earlier this week. The CB request is
signalled as it is firing up, and then drops into a hard loop.

            nfsd-1986  [001]   123.028240: svc_drop:             addr=192.168.2.51:52077 xid=0x489a88f6 flags=RQ_SECURE|RQ_USEDEFERRAL|RQ_SPLICE_OK|RQ_BUSY
   kworker/u8:12-442   [003]   123.028242: rpc_task_signalled:   task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE status=0 action=rpc_prepare_task
   kworker/u8:10-440   [000]   123.028289: rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=rpc_prepare_task
   kworker/u8:10-440   [000]   123.028289: rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_start
   kworker/u8:10-440   [000]   123.028291: rpc_request:          task:47@3 nfs4_cbv1 CB_RECALL (async)
   kworker/u8:10-440   [000]   123.028291: rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_reserve
   kworker/u8:10-440   [000]   123.028298: rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_reserveresult
   kworker/u8:10-440   [000]   123.028299: rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_refresh
   kworker/u8:10-440   [000]   123.028324: rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_refreshresult
   kworker/u8:10-440   [000]   123.028324: rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_allocate
   kworker/u8:10-440   [000]   123.028340: rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_encode
   kworker/u8:10-440   [000]   123.028356: xprt_enq_xmit:        task:47@3 xid=0x8b95e935 seqno=0 stage=4
   kworker/u8:10-440   [000]   123.028357: rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|NEED_XMIT|NEED_RECV|SIGNALLED status=0 action=call_transmit
   kworker/u8:10-440   [000]   123.028363: xprt_reserve_cong:    task:47@3 snd_task:47 cong=0 cwnd=256
   kworker/u8:10-440   [000]   123.028365: xprt_transmit:        task:47@3 xid=0x8b95e935 seqno=0 status=-512
   kworker/u8:10-440   [000]   123.028368: xprt_release_cong:    task:47@3 snd_task:4294967295 cong=0 cwnd=256
   kworker/u8:10-440   [000]   123.028368: rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=0 action=call_transmit_status
   kworker/u8:10-440   [000]   123.028371: rpc_task_sleep:       task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=0 timeout=0 queue=xprt_pending
   kworker/u8:10-440   [000]   123.028376: rpc_task_wakeup:      task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|QUEUED|ACTIVE|NEED_RECV|SIGNALLED status=-107 timeout=9000 queue=xprt_pending
   kworker/u8:10-440   [000]   123.028377: rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=-107 action=xprt_timer
   kworker/u8:10-440   [000]   123.028377: rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=-107 action=call_status
   kworker/u8:10-440   [000]   123.028378: rpc_call_status:      task:47@3 status=-107
   kworker/u8:10-440   [000]   123.028378: rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=0 action=call_encode
   kworker/u8:10-440   [000]   123.028380: xprt_enq_xmit:        task:47@3 xid=0x8b95e935 seqno=0 stage=4
   kworker/u8:10-440   [000]   123.028380: rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|NEED_XMIT|NEED_RECV|SIGNALLED status=0 action=call_transmit
   kworker/u8:10-440   [000]   123.028381: xprt_reserve_cong:    task:47@3 snd_task:47 cong=0 cwnd=256
   kworker/u8:10-440   [000]   123.028381: xprt_transmit:        task:47@3 xid=0x8b95e935 seqno=0 status=-512
   kworker/u8:10-440   [000]   123.028382: xprt_release_cong:    task:47@3 snd_task:4294967295 cong=0 cwnd=256
   kworker/u8:10-440   [000]   123.028382: rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=0 action=call_transmit_status

If we want to avoid the early RPC_IS_SIGNALLED check in the scheduler,
I guess an alternative would be to have call_transmit_status() check for
RPC_TASK_SIGNALLED. That's not as broad a fix, but it would address
both the NULL and the CB looping cases, IIUC.


>>> To avoid changing the behavior of call_transmit_status in the
>>> forward direction, make backchannel RPCs return with a different
>>> error than ENOTCONN when they fail.
>>> 
>>> Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
>>> ---
>>> net/sunrpc/xprtrdma/svc_rdma_backchannel.c |   15 ++++++++++-----
>>> net/sunrpc/xprtsock.c                      |    6 ++++--
>>> 2 files changed, 14 insertions(+), 7 deletions(-)
>>> 
>>> I'm playing with this fix. It seems to be required in order to get
>>> Kerberos mounts to work under load with NFSv4.1 and later on RDMA.
>>> 
>>> If there are no objections, I can carry this for v5.7-rc ...
>>> 
>>> 
>>> diff --git a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
>>> b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
>>> index d510a3a15d4b..b8a72d7fbcc2 100644
>>> --- a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
>>> +++ b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
>>> @@ -207,11 +207,16 @@ rpcrdma_bc_send_request(struct svcxprt_rdma
>>> *rdma, struct rpc_rqst *rqst)
>>> 
>>> drop_connection:
>>> 	dprintk("svcrdma: failed to send bc call\n");
>>> -	return -ENOTCONN;
>>> +	return -EHOSTUNREACH;
>>> }
>>> 
>>> -/* Send an RPC call on the passive end of a transport
>>> - * connection.
>>> +/**
>>> + * xprt_rdma_bc_send_request - send an RPC backchannel Call
>>> + * @rqst: RPC Call in rq_snd_buf
>>> + *
>>> + * Returns:
>>> + *	%0 if the RPC message has been sent
>>> + *	%-EHOSTUNREACH if the Call could not be sent
>>> */
>>> static int
>>> xprt_rdma_bc_send_request(struct rpc_rqst *rqst)
>>> @@ -225,11 +230,11 @@ xprt_rdma_bc_send_request(struct rpc_rqst
>>> *rqst)
>>> 
>>> 	mutex_lock(&sxprt->xpt_mutex);
>>> 
>>> -	ret = -ENOTCONN;
>>> +	ret = -EHOSTUNREACH;
>>> 	rdma = container_of(sxprt, struct svcxprt_rdma, sc_xprt);
>>> 	if (!test_bit(XPT_DEAD, &sxprt->xpt_flags)) {
>>> 		ret = rpcrdma_bc_send_request(rdma, rqst);
>>> -		if (ret == -ENOTCONN)
>>> +		if (ret < 0)
>>> 			svc_close_xprt(sxprt);
>>> 	}
>>> 
>>> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
>>> index 17cb902e5153..92a358fd2ff0 100644
>>> --- a/net/sunrpc/xprtsock.c
>>> +++ b/net/sunrpc/xprtsock.c
>>> @@ -2543,7 +2543,9 @@ static int bc_sendto(struct rpc_rqst *req)
>>> 	req->rq_xtime = ktime_get();
>>> 	err = xprt_sock_sendmsg(transport->sock, &msg, xdr, 0, marker,
>>> &sent);
>>> 	xdr_free_bvec(xdr);
>>> -	if (err < 0 || sent != (xdr->len + sizeof(marker)))
>>> +	if (err < 0)
>>> +		return -EHOSTUNREACH;
>>> +	if (sent != (xdr->len + sizeof(marker)))
>>> 		return -EAGAIN;
>>> 	return sent;
>>> }
>>> @@ -2567,7 +2569,7 @@ static int bc_send_request(struct rpc_rqst
>>> *req)
>>> 	 */
>>> 	mutex_lock(&xprt->xpt_mutex);
>>> 	if (test_bit(XPT_DEAD, &xprt->xpt_flags))
>>> -		len = -ENOTCONN;
>>> +		len = -EHOSTUNREACH;
>>> 	else
>>> 		len = bc_sendto(req);
>>> 	mutex_unlock(&xprt->xpt_mutex);
>>> 
>> -- 
>> Trond Myklebust
>> Linux NFS client maintainer, Hammerspace
>> trond.myklebust@xxxxxxxxxxxxxxx
> 
> --
> Chuck Lever

--
Chuck Lever







[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux