On Fri, 2020-04-03 at 15:42 -0400, Chuck Lever wrote: > Commit 3832591e6fa5 ("SUNRPC: Handle connection issues correctly on > the back channel") intended to make backchannel RPCs fail > immediately when there is no forward channel connection. What's > currently happening is, when the forward channel conneciton goes > away, backchannel operations are causing hard loops because > call_transmit_status's SOFTCONN logic ignores ENOTCONN. Won't RPC_TASK_NOCONNECT do the right thing? It should cause the request to exit with an ENOTCONN error when it hits call_connect(). > > To avoid changing the behavior of call_transmit_status in the > forward direction, make backchannel RPCs return with a different > error than ENOTCONN when they fail. > > Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx> > --- > net/sunrpc/xprtrdma/svc_rdma_backchannel.c | 15 ++++++++++----- > net/sunrpc/xprtsock.c | 6 ++++-- > 2 files changed, 14 insertions(+), 7 deletions(-) > > I'm playing with this fix. It seems to be required in order to get > Kerberos mounts to work under load with NFSv4.1 and later on RDMA. > > If there are no objections, I can carry this for v5.7-rc ... > > > diff --git a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c > b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c > index d510a3a15d4b..b8a72d7fbcc2 100644 > --- a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c > +++ b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c > @@ -207,11 +207,16 @@ rpcrdma_bc_send_request(struct svcxprt_rdma > *rdma, struct rpc_rqst *rqst) > > drop_connection: > dprintk("svcrdma: failed to send bc call\n"); > - return -ENOTCONN; > + return -EHOSTUNREACH; > } > > -/* Send an RPC call on the passive end of a transport > - * connection. > +/** > + * xprt_rdma_bc_send_request - send an RPC backchannel Call > + * @rqst: RPC Call in rq_snd_buf > + * > + * Returns: > + * %0 if the RPC message has been sent > + * %-EHOSTUNREACH if the Call could not be sent > */ > static int > xprt_rdma_bc_send_request(struct rpc_rqst *rqst) > @@ -225,11 +230,11 @@ xprt_rdma_bc_send_request(struct rpc_rqst > *rqst) > > mutex_lock(&sxprt->xpt_mutex); > > - ret = -ENOTCONN; > + ret = -EHOSTUNREACH; > rdma = container_of(sxprt, struct svcxprt_rdma, sc_xprt); > if (!test_bit(XPT_DEAD, &sxprt->xpt_flags)) { > ret = rpcrdma_bc_send_request(rdma, rqst); > - if (ret == -ENOTCONN) > + if (ret < 0) > svc_close_xprt(sxprt); > } > > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c > index 17cb902e5153..92a358fd2ff0 100644 > --- a/net/sunrpc/xprtsock.c > +++ b/net/sunrpc/xprtsock.c > @@ -2543,7 +2543,9 @@ static int bc_sendto(struct rpc_rqst *req) > req->rq_xtime = ktime_get(); > err = xprt_sock_sendmsg(transport->sock, &msg, xdr, 0, marker, > &sent); > xdr_free_bvec(xdr); > - if (err < 0 || sent != (xdr->len + sizeof(marker))) > + if (err < 0) > + return -EHOSTUNREACH; > + if (sent != (xdr->len + sizeof(marker))) > return -EAGAIN; > return sent; > } > @@ -2567,7 +2569,7 @@ static int bc_send_request(struct rpc_rqst > *req) > */ > mutex_lock(&xprt->xpt_mutex); > if (test_bit(XPT_DEAD, &xprt->xpt_flags)) > - len = -ENOTCONN; > + len = -EHOSTUNREACH; > else > len = bc_sendto(req); > mutex_unlock(&xprt->xpt_mutex); > -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@xxxxxxxxxxxxxxx