> On Apr 3, 2020, at 4:05 PM, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote: > > Hi Trond, thanks for the look! > >> On Apr 3, 2020, at 4:00 PM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote: >> >> On Fri, 2020-04-03 at 15:42 -0400, Chuck Lever wrote: >>> Commit 3832591e6fa5 ("SUNRPC: Handle connection issues correctly on >>> the back channel") intended to make backchannel RPCs fail >>> immediately when there is no forward channel connection. What's >>> currently happening is, when the forward channel conneciton goes >>> away, backchannel operations are causing hard loops because >>> call_transmit_status's SOFTCONN logic ignores ENOTCONN. >> >> Won't RPC_TASK_NOCONNECT do the right thing? It should cause the >> request to exit with an ENOTCONN error when it hits call_connect(). > > OK, so does that mean SOFTCONN is entirely the wrong semantic here? > > Was RPC_TASK_NOCONNECT available when 3832591e6fa5 was merged? It turns out 3832591e6fa5 is not related. It's 58255a4e3ce5 that added RPC_TASK_SOFTCONN on NFSv4 callback Calls. However, the server uses nfsd4_run_cb_work() for both NFSv4.0 and NFSv4.1 callbacks. IMO a fix for this will have to take care that RPC_TASK_NOCONNECT is not set on NFSv4.0 callback tasks. Unfortunately the looping behavior appears also related to the RPC_IS_SIGNALLED bug I reported earlier this week. The CB request is signalled as it is firing up, and then drops into a hard loop. nfsd-1986 [001] 123.028240: svc_drop: addr=192.168.2.51:52077 xid=0x489a88f6 flags=RQ_SECURE|RQ_USEDEFERRAL|RQ_SPLICE_OK|RQ_BUSY kworker/u8:12-442 [003] 123.028242: rpc_task_signalled: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE status=0 action=rpc_prepare_task kworker/u8:10-440 [000] 123.028289: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=rpc_prepare_task kworker/u8:10-440 [000] 123.028289: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_start kworker/u8:10-440 [000] 123.028291: rpc_request: task:47@3 nfs4_cbv1 CB_RECALL (async) kworker/u8:10-440 [000] 123.028291: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_reserve kworker/u8:10-440 [000] 123.028298: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_reserveresult kworker/u8:10-440 [000] 123.028299: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_refresh kworker/u8:10-440 [000] 123.028324: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_refreshresult kworker/u8:10-440 [000] 123.028324: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_allocate kworker/u8:10-440 [000] 123.028340: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_encode kworker/u8:10-440 [000] 123.028356: xprt_enq_xmit: task:47@3 xid=0x8b95e935 seqno=0 stage=4 kworker/u8:10-440 [000] 123.028357: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|NEED_XMIT|NEED_RECV|SIGNALLED status=0 action=call_transmit kworker/u8:10-440 [000] 123.028363: xprt_reserve_cong: task:47@3 snd_task:47 cong=0 cwnd=256 kworker/u8:10-440 [000] 123.028365: xprt_transmit: task:47@3 xid=0x8b95e935 seqno=0 status=-512 kworker/u8:10-440 [000] 123.028368: xprt_release_cong: task:47@3 snd_task:4294967295 cong=0 cwnd=256 kworker/u8:10-440 [000] 123.028368: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=0 action=call_transmit_status kworker/u8:10-440 [000] 123.028371: rpc_task_sleep: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=0 timeout=0 queue=xprt_pending kworker/u8:10-440 [000] 123.028376: rpc_task_wakeup: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|QUEUED|ACTIVE|NEED_RECV|SIGNALLED status=-107 timeout=9000 queue=xprt_pending kworker/u8:10-440 [000] 123.028377: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=-107 action=xprt_timer kworker/u8:10-440 [000] 123.028377: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=-107 action=call_status kworker/u8:10-440 [000] 123.028378: rpc_call_status: task:47@3 status=-107 kworker/u8:10-440 [000] 123.028378: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=0 action=call_encode kworker/u8:10-440 [000] 123.028380: xprt_enq_xmit: task:47@3 xid=0x8b95e935 seqno=0 stage=4 kworker/u8:10-440 [000] 123.028380: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|NEED_XMIT|NEED_RECV|SIGNALLED status=0 action=call_transmit kworker/u8:10-440 [000] 123.028381: xprt_reserve_cong: task:47@3 snd_task:47 cong=0 cwnd=256 kworker/u8:10-440 [000] 123.028381: xprt_transmit: task:47@3 xid=0x8b95e935 seqno=0 status=-512 kworker/u8:10-440 [000] 123.028382: xprt_release_cong: task:47@3 snd_task:4294967295 cong=0 cwnd=256 kworker/u8:10-440 [000] 123.028382: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=0 action=call_transmit_status If we want to avoid the early RPC_IS_SIGNALLED check in the scheduler, I guess an alternative would be to have call_transmit_status() check for RPC_TASK_SIGNALLED. That's not as broad a fix, but it would address both the NULL and the CB looping cases, IIUC. >>> To avoid changing the behavior of call_transmit_status in the >>> forward direction, make backchannel RPCs return with a different >>> error than ENOTCONN when they fail. >>> >>> Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx> >>> --- >>> net/sunrpc/xprtrdma/svc_rdma_backchannel.c | 15 ++++++++++----- >>> net/sunrpc/xprtsock.c | 6 ++++-- >>> 2 files changed, 14 insertions(+), 7 deletions(-) >>> >>> I'm playing with this fix. It seems to be required in order to get >>> Kerberos mounts to work under load with NFSv4.1 and later on RDMA. >>> >>> If there are no objections, I can carry this for v5.7-rc ... >>> >>> >>> diff --git a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c >>> b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c >>> index d510a3a15d4b..b8a72d7fbcc2 100644 >>> --- a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c >>> +++ b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c >>> @@ -207,11 +207,16 @@ rpcrdma_bc_send_request(struct svcxprt_rdma >>> *rdma, struct rpc_rqst *rqst) >>> >>> drop_connection: >>> dprintk("svcrdma: failed to send bc call\n"); >>> - return -ENOTCONN; >>> + return -EHOSTUNREACH; >>> } >>> >>> -/* Send an RPC call on the passive end of a transport >>> - * connection. >>> +/** >>> + * xprt_rdma_bc_send_request - send an RPC backchannel Call >>> + * @rqst: RPC Call in rq_snd_buf >>> + * >>> + * Returns: >>> + * %0 if the RPC message has been sent >>> + * %-EHOSTUNREACH if the Call could not be sent >>> */ >>> static int >>> xprt_rdma_bc_send_request(struct rpc_rqst *rqst) >>> @@ -225,11 +230,11 @@ xprt_rdma_bc_send_request(struct rpc_rqst >>> *rqst) >>> >>> mutex_lock(&sxprt->xpt_mutex); >>> >>> - ret = -ENOTCONN; >>> + ret = -EHOSTUNREACH; >>> rdma = container_of(sxprt, struct svcxprt_rdma, sc_xprt); >>> if (!test_bit(XPT_DEAD, &sxprt->xpt_flags)) { >>> ret = rpcrdma_bc_send_request(rdma, rqst); >>> - if (ret == -ENOTCONN) >>> + if (ret < 0) >>> svc_close_xprt(sxprt); >>> } >>> >>> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c >>> index 17cb902e5153..92a358fd2ff0 100644 >>> --- a/net/sunrpc/xprtsock.c >>> +++ b/net/sunrpc/xprtsock.c >>> @@ -2543,7 +2543,9 @@ static int bc_sendto(struct rpc_rqst *req) >>> req->rq_xtime = ktime_get(); >>> err = xprt_sock_sendmsg(transport->sock, &msg, xdr, 0, marker, >>> &sent); >>> xdr_free_bvec(xdr); >>> - if (err < 0 || sent != (xdr->len + sizeof(marker))) >>> + if (err < 0) >>> + return -EHOSTUNREACH; >>> + if (sent != (xdr->len + sizeof(marker))) >>> return -EAGAIN; >>> return sent; >>> } >>> @@ -2567,7 +2569,7 @@ static int bc_send_request(struct rpc_rqst >>> *req) >>> */ >>> mutex_lock(&xprt->xpt_mutex); >>> if (test_bit(XPT_DEAD, &xprt->xpt_flags)) >>> - len = -ENOTCONN; >>> + len = -EHOSTUNREACH; >>> else >>> len = bc_sendto(req); >>> mutex_unlock(&xprt->xpt_mutex); >>> >> -- >> Trond Myklebust >> Linux NFS client maintainer, Hammerspace >> trond.myklebust@xxxxxxxxxxxxxxx > > -- > Chuck Lever -- Chuck Lever