> On Jul 26, 2021, at 8:03 AM, trondmy@xxxxxxxxxx wrote:
> 
> From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
> 
> The xprtrdma client code currently relies on the task that initiated the
> connect to hold the XPRT_LOCK for the duration of the connection
> attempt. If the task is woken early, due to some other event, then that
> lock could get released early.
> Avoid races by using the same mechanism that the socket code uses of
> transferring lock ownership to the RDMA connect worker itself. That
> frees us to call rpcrdma_xprt_disconnect() directly since we're now
> guaranteed exclusion w.r.t. other callers.
> 
> Fixes: 4cf44be6f1e8 ("xprtrdma: Fix recursion into rpcrdma_xprt_disconnect()")
> Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>

I would like to test this change with disconnect injection before it is
merged.

> ---
>  net/sunrpc/xprt.c               |  2 ++
>  net/sunrpc/xprtrdma/transport.c | 11 +++++------
>  2 files changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> index aae5a328b15b..b88ac8132054 100644
> --- a/net/sunrpc/xprt.c
> +++ b/net/sunrpc/xprt.c
> @@ -877,6 +877,7 @@ bool xprt_lock_connect(struct rpc_xprt *xprt,
>  	spin_unlock(&xprt->transport_lock);
>  	return ret;
>  }
> +EXPORT_SYMBOL_GPL(xprt_lock_connect);
>  
>  void xprt_unlock_connect(struct rpc_xprt *xprt, void *cookie)
>  {
> @@ -893,6 +894,7 @@ void xprt_unlock_connect(struct rpc_xprt *xprt, void *cookie)
>  	spin_unlock(&xprt->transport_lock);
>  	wake_up_bit(&xprt->state, XPRT_LOCKED);
>  }
> +EXPORT_SYMBOL_GPL(xprt_unlock_connect);
>  
>  /**
>   * xprt_connect - schedule a transport connect operation
> diff --git a/net/sunrpc/xprtrdma/transport.c b/net/sunrpc/xprtrdma/transport.c
> index 9c2ffc67c0fd..975aef16ad34 100644
> --- a/net/sunrpc/xprtrdma/transport.c
> +++ b/net/sunrpc/xprtrdma/transport.c
> @@ -250,12 +250,9 @@ xprt_rdma_connect_worker(struct work_struct *work)
>  					   xprt->stat.connect_start;
>  		xprt_set_connected(xprt);
>  		rc = -EAGAIN;
> -	} else {
> -		/* Force a call to xprt_rdma_close to clean up */
> -		spin_lock(&xprt->transport_lock);
> -		set_bit(XPRT_CLOSE_WAIT, &xprt->state);
> -		spin_unlock(&xprt->transport_lock);
> -	}
> +	} else
> +		rpcrdma_xprt_disconnect(r_xprt);
> +	xprt_unlock_connect(xprt, r_xprt);
>  	xprt_wake_pending_tasks(xprt, rc);
>  }
>  
> @@ -489,6 +486,8 @@ xprt_rdma_connect(struct rpc_xprt *xprt, struct rpc_task *task)
>  	struct rpcrdma_ep *ep = r_xprt->rx_ep;
>  	unsigned long delay;
>  
> +	WARN_ON_ONCE(!xprt_lock_connect(xprt, task, r_xprt));
> +
>  	delay = 0;
>  	if (ep && ep->re_connect_status != 0) {
>  		delay = xprt_reconnect_delay(xprt);
> -- 
> 2.31.1
> 

--
Chuck Lever
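
For readers less familiar with the cookie-based locking that
xprt_lock_connect() and xprt_unlock_connect() provide, the sketch below
shows the ownership-transfer pattern in miniature. It is a user-space
analogue, not the kernel's sunrpc code: every name in it (fake_xprt,
fake_lock_connect, fake_connect_worker, and so on) is hypothetical, and it
only illustrates the idea that the task initiating the connect takes the
lock with a cookie, hands the work to a worker, and the worker, as the
cookie holder, is the one that finally releases the lock.

/* Minimal user-space sketch of cookie-based connect-lock ownership
 * transfer.  All names here are hypothetical; this is not the kernel's
 * sunrpc code, only the shape of the pattern the patch adopts.
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

struct fake_xprt {
	pthread_mutex_t lock;	/* stands in for xprt->transport_lock */
	bool connect_locked;	/* stands in for XPRT_LOCKED during connect */
	void *cookie;		/* identifies the current lock owner */
};

/* Take the connect lock on behalf of @cookie.  Returns true if @cookie
 * now owns (or already owned) the lock. */
static bool fake_lock_connect(struct fake_xprt *xprt, void *cookie)
{
	bool ret = false;

	pthread_mutex_lock(&xprt->lock);
	if (!xprt->connect_locked) {
		xprt->connect_locked = true;
		xprt->cookie = cookie;
		ret = true;
	} else if (xprt->cookie == cookie) {
		ret = true;
	}
	pthread_mutex_unlock(&xprt->lock);
	return ret;
}

/* Release the connect lock, but only if @cookie is the owner.  A task
 * that was woken early and no longer owns the lock is a no-op here. */
static void fake_unlock_connect(struct fake_xprt *xprt, void *cookie)
{
	pthread_mutex_lock(&xprt->lock);
	if (xprt->connect_locked && xprt->cookie == cookie) {
		xprt->connect_locked = false;
		xprt->cookie = NULL;
	}
	pthread_mutex_unlock(&xprt->lock);
}

/* The connect worker: it inherits lock ownership from the task that
 * scheduled it, performs the (pretend) connection attempt, and is the
 * one that finally drops the lock. */
static void *fake_connect_worker(void *arg)
{
	struct fake_xprt *xprt = arg;

	printf("worker: connecting under the transferred lock\n");
	sleep(1);			/* pretend connect/disconnect work */
	fake_unlock_connect(xprt, xprt);
	printf("worker: lock released\n");
	return NULL;
}

int main(void)
{
	struct fake_xprt xprt = {
		.lock = PTHREAD_MUTEX_INITIALIZER,
	};
	pthread_t worker;

	/* The initiating task takes the lock with the transport itself
	 * as the cookie, then hands the work (and the lock) to the
	 * worker. */
	if (!fake_lock_connect(&xprt, &xprt))
		return 1;
	pthread_create(&worker, NULL, fake_connect_worker, &xprt);

	/* Even if the initiator "wakes early" and goes away, it does not
	 * unlock: only the worker, holding the same cookie, does that. */
	pthread_join(worker, NULL);
	return 0;
}

Something like "cc -pthread sketch.c" should build it. The point is only
that release is keyed to the cookie, so an initiator that wakes early
cannot drop a lock whose ownership has already moved to the worker.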