Re: [PATCH] SUNRPC/xprtrdma: Fix reconnection locking

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> On Jul 26, 2021, at 8:03 AM, trondmy@xxxxxxxxxx wrote:
> 
> From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
> 
> The xprtrdma client code currently relies on the task that initiated the
> connect to hold the XPRT_LOCK for the duration of the connection
> attempt. If the task is woken early, due to some other event, then that
> lock could get released early.
> Avoid races by using the same mechanism that the socket code uses of
> transferring lock ownership to the RDMA connect worker itself. That
> frees us to call rpcrdma_xprt_disconnect() directly since we're now
> guaranteed exclusion w.r.t. other callers.
> 
> Fixes: 4cf44be6f1e8 ("xprtrdma: Fix recursion into rpcrdma_xprt_disconnect()")
> Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>

First, thanks for the clean up. I had been wondering about that,
but I haven't encountered any problems so far with the code as
it was.

Second, after applying this patch and enabling KASAN, I ran
several disconnect injection passes with an intensive software
development workload. About 500 disconnects were injected over
the test runs. I did not encounter any issues.

I inspected a trace capture of a couple injected disconnects
and did not see anything unexpected.

I don't have a server-side disconnect injection rig. I probably
should build one.

Tested-by: Chuck Lever <chuck.lever@xxxxxxxxxx>

> ---
> net/sunrpc/xprt.c               |  2 ++
> net/sunrpc/xprtrdma/transport.c | 11 +++++------
> 2 files changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> index aae5a328b15b..b88ac8132054 100644
> --- a/net/sunrpc/xprt.c
> +++ b/net/sunrpc/xprt.c
> @@ -877,6 +877,7 @@ bool xprt_lock_connect(struct rpc_xprt *xprt,
> 	spin_unlock(&xprt->transport_lock);
> 	return ret;
> }
> +EXPORT_SYMBOL_GPL(xprt_lock_connect);
> 
> void xprt_unlock_connect(struct rpc_xprt *xprt, void *cookie)
> {
> @@ -893,6 +894,7 @@ void xprt_unlock_connect(struct rpc_xprt *xprt, void *cookie)
> 	spin_unlock(&xprt->transport_lock);
> 	wake_up_bit(&xprt->state, XPRT_LOCKED);
> }
> +EXPORT_SYMBOL_GPL(xprt_unlock_connect);
> 
> /**
>  * xprt_connect - schedule a transport connect operation
> diff --git a/net/sunrpc/xprtrdma/transport.c b/net/sunrpc/xprtrdma/transport.c
> index 9c2ffc67c0fd..975aef16ad34 100644
> --- a/net/sunrpc/xprtrdma/transport.c
> +++ b/net/sunrpc/xprtrdma/transport.c
> @@ -250,12 +250,9 @@ xprt_rdma_connect_worker(struct work_struct *work)
> 					   xprt->stat.connect_start;
> 		xprt_set_connected(xprt);
> 		rc = -EAGAIN;
> -	} else {
> -		/* Force a call to xprt_rdma_close to clean up */
> -		spin_lock(&xprt->transport_lock);
> -		set_bit(XPRT_CLOSE_WAIT, &xprt->state);
> -		spin_unlock(&xprt->transport_lock);
> -	}
> +	} else
> +		rpcrdma_xprt_disconnect(r_xprt);
> +	xprt_unlock_connect(xprt, r_xprt);
> 	xprt_wake_pending_tasks(xprt, rc);
> }
> 
> @@ -489,6 +486,8 @@ xprt_rdma_connect(struct rpc_xprt *xprt, struct rpc_task *task)
> 	struct rpcrdma_ep *ep = r_xprt->rx_ep;
> 	unsigned long delay;
> 
> +	WARN_ON_ONCE(!xprt_lock_connect(xprt, task, r_xprt));
> +
> 	delay = 0;
> 	if (ep && ep->re_connect_status != 0) {
> 		delay = xprt_reconnect_delay(xprt);
> -- 
> 2.31.1
> 

--
Chuck Lever







[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux