Re: [PATCH RFC] SUNRPC: Backchannel RPCs don't fail when the transport disconnects

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 2020-04-03 at 17:46 -0400, Chuck Lever wrote:
> > On Apr 3, 2020, at 4:05 PM, Chuck Lever <chuck.lever@xxxxxxxxxx>
> > wrote:
> > 
> > Hi Trond, thanks for the look!
> > 
> > > On Apr 3, 2020, at 4:00 PM, Trond Myklebust <
> > > trondmy@xxxxxxxxxxxxxxx> wrote:
> > > 
> > > On Fri, 2020-04-03 at 15:42 -0400, Chuck Lever wrote:
> > > > Commit 3832591e6fa5 ("SUNRPC: Handle connection issues
> > > > correctly on
> > > > the back channel") intended to make backchannel RPCs fail
> > > > immediately when there is no forward channel connection. What's
> > > > currently happening is, when the forward channel conneciton
> > > > goes
> > > > away, backchannel operations are causing hard loops because
> > > > call_transmit_status's SOFTCONN logic ignores ENOTCONN.
> > > 
> > > Won't RPC_TASK_NOCONNECT do the right thing? It should cause the
> > > request to exit with an ENOTCONN error when it hits
> > > call_connect().
> > 
> > OK, so does that mean SOFTCONN is entirely the wrong semantic here?
> > 
> > Was RPC_TASK_NOCONNECT available when 3832591e6fa5 was merged?
> 
> It turns out 3832591e6fa5 is not related. It's 58255a4e3ce5 that
> added
> RPC_TASK_SOFTCONN on NFSv4 callback Calls.
> 
> However, the server uses nfsd4_run_cb_work() for both NFSv4.0 and
> NFSv4.1 callbacks. IMO a fix for this will have to take care that
> RPC_TASK_NOCONNECT is not set on NFSv4.0 callback tasks.

Possibly, but don't we really want to let such a NFSv4.0 request fail
and send another CB_NULL? There is already version-specific code in
nfsd4_process_cb_update() to set up the callback client.

> Unfortunately the looping behavior appears also related to the
> RPC_IS_SIGNALLED bug I reported earlier this week. The CB request is
> signalled as it is firing up, and then drops into a hard loop.
> 

Hmm... Is there any reason why xprt_rdma_alloc_slot() should not be
checking if the transport is being shut down? After all, if the
transport is being destroyed here, then there is no point in allocating
a slot for that specific RDMA transport (it's not as if it can be
reused elsewhere).

>             nfsd-1986  [001]   123.028240:
> svc_drop:             addr=192.168.2.51:52077 xid=0x489a88f6
> flags=RQ_SECURE|RQ_USEDEFERRAL|RQ_SPLICE_OK|RQ_BUSY
>    kworker/u8:12-442   [003]   123.028242:
> rpc_task_signalled:   task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE status=0 action=rpc_prepare_task
>    kworker/u8:10-440   [000]   123.028289:
> rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=rpc_prepare_task
>    kworker/u8:10-440   [000]   123.028289:
> rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_start
>    kworker/u8:10-440   [000]   123.028291:
> rpc_request:          task:47@3 nfs4_cbv1 CB_RECALL (async)
>    kworker/u8:10-440   [000]   123.028291:
> rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_reserve
>    kworker/u8:10-440   [000]   123.028298:
> rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_reserveresult
>    kworker/u8:10-440   [000]   123.028299:
> rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_refresh
>    kworker/u8:10-440   [000]   123.028324:
> rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_refreshresult
>    kworker/u8:10-440   [000]   123.028324:
> rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_allocate
>    kworker/u8:10-440   [000]   123.028340:
> rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_encode
>    kworker/u8:10-440   [000]   123.028356:
> xprt_enq_xmit:        task:47@3 xid=0x8b95e935 seqno=0 stage=4
>    kworker/u8:10-440   [000]   123.028357:
> rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|NEED_XMIT|NEED_RECV|SIGNALLED status=0
> action=call_transmit
>    kworker/u8:10-440   [000]   123.028363:
> xprt_reserve_cong:    task:47@3 snd_task:47 cong=0 cwnd=256
>    kworker/u8:10-440   [000]   123.028365:
> xprt_transmit:        task:47@3 xid=0x8b95e935 seqno=0 status=-512
>    kworker/u8:10-440   [000]   123.028368:
> xprt_release_cong:    task:47@3 snd_task:4294967295 cong=0 cwnd=256
>    kworker/u8:10-440   [000]   123.028368:
> rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=0
> action=call_transmit_status
>    kworker/u8:10-440   [000]   123.028371:
> rpc_task_sleep:       task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=0 timeout=0
> queue=xprt_pending
>    kworker/u8:10-440   [000]   123.028376:
> rpc_task_wakeup:      task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|QUEUED|ACTIVE|NEED_RECV|SIGNALLED status=-107
> timeout=9000 queue=xprt_pending
>    kworker/u8:10-440   [000]   123.028377:
> rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=-107
> action=xprt_timer
>    kworker/u8:10-440   [000]   123.028377:
> rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=-107
> action=call_status
>    kworker/u8:10-440   [000]   123.028378:
> rpc_call_status:      task:47@3 status=-107
>    kworker/u8:10-440   [000]   123.028378:
> rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=0
> action=call_encode
>    kworker/u8:10-440   [000]   123.028380:
> xprt_enq_xmit:        task:47@3 xid=0x8b95e935 seqno=0 stage=4
>    kworker/u8:10-440   [000]   123.028380:
> rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|NEED_XMIT|NEED_RECV|SIGNALLED status=0
> action=call_transmit
>    kworker/u8:10-440   [000]   123.028381:
> xprt_reserve_cong:    task:47@3 snd_task:47 cong=0 cwnd=256
>    kworker/u8:10-440   [000]   123.028381:
> xprt_transmit:        task:47@3 xid=0x8b95e935 seqno=0 status=-512
>    kworker/u8:10-440   [000]   123.028382:
> xprt_release_cong:    task:47@3 snd_task:4294967295 cong=0 cwnd=256
>    kworker/u8:10-440   [000]   123.028382:
> rpc_task_run_action:  task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=0
> action=call_transmit_status
> 
> If we want to avoid the early RPC_IS_SIGNALLED check in the
> scheduler,
> I guess an alternative would be to have call_transmit_status() check
> for
> RPC_TASK_SIGNALLED. That's not as broad a fix, but it would address
> both the NULL and the CB looping cases, IIUC.
> 
> 
> > > > To avoid changing the behavior of call_transmit_status in the
> > > > forward direction, make backchannel RPCs return with a
> > > > different
> > > > error than ENOTCONN when they fail.
> > > > 
> > > > Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
> > > > ---
> > > > net/sunrpc/xprtrdma/svc_rdma_backchannel.c |   15 ++++++++++---
> > > > --
> > > > net/sunrpc/xprtsock.c                      |    6 ++++--
> > > > 2 files changed, 14 insertions(+), 7 deletions(-)
> > > > 
> > > > I'm playing with this fix. It seems to be required in order to
> > > > get
> > > > Kerberos mounts to work under load with NFSv4.1 and later on
> > > > RDMA.
> > > > 
> > > > If there are no objections, I can carry this for v5.7-rc ...
> > > > 
> > > > 
> > > > diff --git a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
> > > > b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
> > > > index d510a3a15d4b..b8a72d7fbcc2 100644
> > > > --- a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
> > > > +++ b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
> > > > @@ -207,11 +207,16 @@ rpcrdma_bc_send_request(struct
> > > > svcxprt_rdma
> > > > *rdma, struct rpc_rqst *rqst)
> > > > 
> > > > drop_connection:
> > > > 	dprintk("svcrdma: failed to send bc call\n");
> > > > -	return -ENOTCONN;
> > > > +	return -EHOSTUNREACH;
> > > > }
> > > > 
> > > > -/* Send an RPC call on the passive end of a transport
> > > > - * connection.
> > > > +/**
> > > > + * xprt_rdma_bc_send_request - send an RPC backchannel Call
> > > > + * @rqst: RPC Call in rq_snd_buf
> > > > + *
> > > > + * Returns:
> > > > + *	%0 if the RPC message has been sent
> > > > + *	%-EHOSTUNREACH if the Call could not be sent
> > > > */
> > > > static int
> > > > xprt_rdma_bc_send_request(struct rpc_rqst *rqst)
> > > > @@ -225,11 +230,11 @@ xprt_rdma_bc_send_request(struct rpc_rqst
> > > > *rqst)
> > > > 
> > > > 	mutex_lock(&sxprt->xpt_mutex);
> > > > 
> > > > -	ret = -ENOTCONN;
> > > > +	ret = -EHOSTUNREACH;
> > > > 	rdma = container_of(sxprt, struct svcxprt_rdma,
> > > > sc_xprt);
> > > > 	if (!test_bit(XPT_DEAD, &sxprt->xpt_flags)) {
> > > > 		ret = rpcrdma_bc_send_request(rdma, rqst);
> > > > -		if (ret == -ENOTCONN)
> > > > +		if (ret < 0)
> > > > 			svc_close_xprt(sxprt);
> > > > 	}
> > > > 
> > > > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> > > > index 17cb902e5153..92a358fd2ff0 100644
> > > > --- a/net/sunrpc/xprtsock.c
> > > > +++ b/net/sunrpc/xprtsock.c
> > > > @@ -2543,7 +2543,9 @@ static int bc_sendto(struct rpc_rqst
> > > > *req)
> > > > 	req->rq_xtime = ktime_get();
> > > > 	err = xprt_sock_sendmsg(transport->sock, &msg, xdr, 0,
> > > > marker,
> > > > &sent);
> > > > 	xdr_free_bvec(xdr);
> > > > -	if (err < 0 || sent != (xdr->len + sizeof(marker)))
> > > > +	if (err < 0)
> > > > +		return -EHOSTUNREACH;
> > > > +	if (sent != (xdr->len + sizeof(marker)))
> > > > 		return -EAGAIN;
> > > > 	return sent;
> > > > }
> > > > @@ -2567,7 +2569,7 @@ static int bc_send_request(struct
> > > > rpc_rqst
> > > > *req)
> > > > 	 */
> > > > 	mutex_lock(&xprt->xpt_mutex);
> > > > 	if (test_bit(XPT_DEAD, &xprt->xpt_flags))
> > > > -		len = -ENOTCONN;
> > > > +		len = -EHOSTUNREACH;
> > > > 	else
> > > > 		len = bc_sendto(req);
> > > > 	mutex_unlock(&xprt->xpt_mutex);
> > > > 
> > > -- 
> > > Trond Myklebust
> > > Linux NFS client maintainer, Hammerspace
> > > trond.myklebust@xxxxxxxxxxxxxxx
> > 
> > --
> > Chuck Lever
> 
> --
> Chuck Lever
> 
> 
> 
-- 
Trond Myklebust
CTO, Hammerspace Inc
4300 El Camino Real, Suite 105
Los Altos, CA 94022
www.hammer.space





[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux