Hi Trond, thanks for the look! > On Apr 3, 2020, at 4:00 PM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote: > > On Fri, 2020-04-03 at 15:42 -0400, Chuck Lever wrote: >> Commit 3832591e6fa5 ("SUNRPC: Handle connection issues correctly on >> the back channel") intended to make backchannel RPCs fail >> immediately when there is no forward channel connection. What's >> currently happening is, when the forward channel conneciton goes >> away, backchannel operations are causing hard loops because >> call_transmit_status's SOFTCONN logic ignores ENOTCONN. > > Won't RPC_TASK_NOCONNECT do the right thing? It should cause the > request to exit with an ENOTCONN error when it hits call_connect(). OK, so does that mean SOFTCONN is entirely the wrong semantic here? Was RPC_TASK_NOCONNECT available when 3832591e6fa5 was merged? >> To avoid changing the behavior of call_transmit_status in the >> forward direction, make backchannel RPCs return with a different >> error than ENOTCONN when they fail. >> >> Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx> >> --- >> net/sunrpc/xprtrdma/svc_rdma_backchannel.c | 15 ++++++++++----- >> net/sunrpc/xprtsock.c | 6 ++++-- >> 2 files changed, 14 insertions(+), 7 deletions(-) >> >> I'm playing with this fix. It seems to be required in order to get >> Kerberos mounts to work under load with NFSv4.1 and later on RDMA. >> >> If there are no objections, I can carry this for v5.7-rc ... >> >> >> diff --git a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c >> b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c >> index d510a3a15d4b..b8a72d7fbcc2 100644 >> --- a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c >> +++ b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c >> @@ -207,11 +207,16 @@ rpcrdma_bc_send_request(struct svcxprt_rdma >> *rdma, struct rpc_rqst *rqst) >> >> drop_connection: >> dprintk("svcrdma: failed to send bc call\n"); >> - return -ENOTCONN; >> + return -EHOSTUNREACH; >> } >> >> -/* Send an RPC call on the passive end of a transport >> - * connection. >> +/** >> + * xprt_rdma_bc_send_request - send an RPC backchannel Call >> + * @rqst: RPC Call in rq_snd_buf >> + * >> + * Returns: >> + * %0 if the RPC message has been sent >> + * %-EHOSTUNREACH if the Call could not be sent >> */ >> static int >> xprt_rdma_bc_send_request(struct rpc_rqst *rqst) >> @@ -225,11 +230,11 @@ xprt_rdma_bc_send_request(struct rpc_rqst >> *rqst) >> >> mutex_lock(&sxprt->xpt_mutex); >> >> - ret = -ENOTCONN; >> + ret = -EHOSTUNREACH; >> rdma = container_of(sxprt, struct svcxprt_rdma, sc_xprt); >> if (!test_bit(XPT_DEAD, &sxprt->xpt_flags)) { >> ret = rpcrdma_bc_send_request(rdma, rqst); >> - if (ret == -ENOTCONN) >> + if (ret < 0) >> svc_close_xprt(sxprt); >> } >> >> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c >> index 17cb902e5153..92a358fd2ff0 100644 >> --- a/net/sunrpc/xprtsock.c >> +++ b/net/sunrpc/xprtsock.c >> @@ -2543,7 +2543,9 @@ static int bc_sendto(struct rpc_rqst *req) >> req->rq_xtime = ktime_get(); >> err = xprt_sock_sendmsg(transport->sock, &msg, xdr, 0, marker, >> &sent); >> xdr_free_bvec(xdr); >> - if (err < 0 || sent != (xdr->len + sizeof(marker))) >> + if (err < 0) >> + return -EHOSTUNREACH; >> + if (sent != (xdr->len + sizeof(marker))) >> return -EAGAIN; >> return sent; >> } >> @@ -2567,7 +2569,7 @@ static int bc_send_request(struct rpc_rqst >> *req) >> */ >> mutex_lock(&xprt->xpt_mutex); >> if (test_bit(XPT_DEAD, &xprt->xpt_flags)) >> - len = -ENOTCONN; >> + len = -EHOSTUNREACH; >> else >> len = bc_sendto(req); >> mutex_unlock(&xprt->xpt_mutex); >> > -- > Trond Myklebust > Linux NFS client maintainer, Hammerspace > trond.myklebust@xxxxxxxxxxxxxxx -- Chuck Lever