Commit 3832591e6fa5 ("SUNRPC: Handle connection issues correctly on the
back channel") intended to make backchannel RPCs fail immediately
when there is no forward channel connection. What's currently
happening is, when the forward channel connection goes away,
backchannel operations are causing hard loops because
call_transmit_status's SOFTCONN logic ignores ENOTCONN.

To avoid changing the behavior of call_transmit_status in the
forward direction, make backchannel RPCs return with a different
error than ENOTCONN when they fail.

Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
---
 net/sunrpc/xprtrdma/svc_rdma_backchannel.c | 15 ++++++++++-----
 net/sunrpc/xprtsock.c                      |  6 ++++--
 2 files changed, 14 insertions(+), 7 deletions(-)

I'm playing with this fix. It seems to be required in order to get
Kerberos mounts to work under load with NFSv4.1 and later on RDMA.

If there are no objections, I can carry this for v5.7-rc ...

diff --git a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
index d510a3a15d4b..b8a72d7fbcc2 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
@@ -207,11 +207,16 @@ rpcrdma_bc_send_request(struct svcxprt_rdma *rdma, struct rpc_rqst *rqst)
 
 drop_connection:
 	dprintk("svcrdma: failed to send bc call\n");
-	return -ENOTCONN;
+	return -EHOSTUNREACH;
 }
 
-/* Send an RPC call on the passive end of a transport
- * connection.
+/**
+ * xprt_rdma_bc_send_request - send an RPC backchannel Call
+ * @rqst: RPC Call in rq_snd_buf
+ *
+ * Returns:
+ *	%0 if the RPC message has been sent
+ *	%-EHOSTUNREACH if the Call could not be sent
  */
 static int
 xprt_rdma_bc_send_request(struct rpc_rqst *rqst)
@@ -225,11 +230,11 @@ xprt_rdma_bc_send_request(struct rpc_rqst *rqst)
 
 	mutex_lock(&sxprt->xpt_mutex);
 
-	ret = -ENOTCONN;
+	ret = -EHOSTUNREACH;
 	rdma = container_of(sxprt, struct svcxprt_rdma, sc_xprt);
 	if (!test_bit(XPT_DEAD, &sxprt->xpt_flags)) {
 		ret = rpcrdma_bc_send_request(rdma, rqst);
-		if (ret == -ENOTCONN)
+		if (ret < 0)
 			svc_close_xprt(sxprt);
 	}
 
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 17cb902e5153..92a358fd2ff0 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -2543,7 +2543,9 @@ static int bc_sendto(struct rpc_rqst *req)
 	req->rq_xtime = ktime_get();
 	err = xprt_sock_sendmsg(transport->sock, &msg, xdr, 0, marker, &sent);
 	xdr_free_bvec(xdr);
-	if (err < 0 || sent != (xdr->len + sizeof(marker)))
+	if (err < 0)
+		return -EHOSTUNREACH;
+	if (sent != (xdr->len + sizeof(marker)))
 		return -EAGAIN;
 	return sent;
 }
@@ -2567,7 +2569,7 @@ static int bc_send_request(struct rpc_rqst *req)
 	 */
 	mutex_lock(&xprt->xpt_mutex);
 	if (test_bit(XPT_DEAD, &xprt->xpt_flags))
-		len = -ENOTCONN;
+		len = -EHOSTUNREACH;
 	else
 		len = bc_sendto(req);
 	mutex_unlock(&xprt->xpt_mutex);
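
The hard loop in the patch description can be illustrated with a small
user-space model. This is only a sketch, not kernel code: toy_bc_send(),
toy_should_retry() and toy_run_call() are hypothetical names, and the
retry rule is a simplification that assumes the status check retries on
ENOTCONN and ends the call on any other error, as the description implies.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a send on a dead backchannel transport:
 * every attempt fails with the error code passed in. */
static int toy_bc_send(int fail_errno)
{
	return -fail_errno;
}

/* Assumed, simplified retry rule: -ENOTCONN means "try again",
 * any other status ends the call. This is not the real
 * call_transmit_status logic. */
static bool toy_should_retry(int status)
{
	return status == -ENOTCONN;
}

/* Returns the number of send attempts made before the call ended,
 * or max_attempts if the safety cap stopped the loop first. */
static int toy_run_call(int fail_errno, int max_attempts)
{
	int attempts = 0;

	assert(max_attempts > 0);
	while (attempts < max_attempts) {
		int status = toy_bc_send(fail_errno);

		attempts++;
		if (!toy_should_retry(status))
			break;
	}
	return attempts;
}
```

In this model, toy_run_call(ENOTCONN, 1000) only stops at the cap of 1000
attempts, while toy_run_call(EHOSTUNREACH, 1000) ends after a single
attempt. That is the reason for returning a distinct error code from the
backchannel send paths.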