Re: [PATCH v3] SUNRPC: Fix a race in the receive code path

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 2017-12-14 at 15:59 -0500, Chuck Lever wrote:
> > On Dec 14, 2017, at 3:37 PM, Trond Myklebust <trondmy@primarydata.c
> > om> wrote:
> > 
> > On Thu, 2017-12-14 at 14:22 -0500, Chuck Lever wrote:
> > > > On Dec 14, 2017, at 2:03 PM, Trond Myklebust <trondmy@primaryda
> > > > ta.c
> > > > om> wrote:
> > > > 
> > > > Does the RDMA code update the connect cookie when the
> > > > connection
> > > > breaks? It looks to me as if it only does that when the
> > > > connection
> > > > is
> > > > re-established. We really want both.
> > > > 
> > > > > I imagine this issue could also impact write buffer
> > > > > exhaustion
> > > > > on TCP.
> > > > 
> > > > See net/sunrpc/xprtsock.c:xs_tcp_state_change()
> > > 
> > > xprtrdma manipulates the connect_cookie in its connect worker,
> > > see rpcrdma_connect_worker. This was added by:
> > > 
> > > commit 575448bd36208f99fe0dd554a43518d798966740
> > > Author:     Tom Talpey <talpey@xxxxxxxxxx>
> > > AuthorDate: Thu Oct 9 15:00:40 2008 -0400
> > > Commit:     Trond Myklebust <Trond.Myklebust@xxxxxxxxxx>
> > > CommitDate: Fri Oct 10 15:10:36 2008 -0400
> > > 
> > >    RPC/RDMA: suppress retransmit on RPC/RDMA clients.
> > > 
> > > Would it be more correct to bump the cookie in
> > > rpcrdma_conn_upcall,
> > > which is the equivalent to xs_tcp_state_change? (if so, why, so
> > > I can compose a reasonable patch description)
> > > 
> > > It could be bumped in the RDMA_CM_EVENT_ESTABLISHED and the
> > > RDMA_CM_EVENT_DISCONNECTED cases, for example. I'm not sure
> > > RDMA provides a distinction between "server disconnected"
> > > and "client disconnected" although that probably does not
> > > matter for this purpose.
> > > 
> > > But, why would the additional cookie update help? The transport
> > > is not disconnecting before the deadlock.
> > > 
> > 
> > The connection cookie's purpose is twofold:
> > 
> > 1) It tracks whether or not a request has been transmitted on the
> > current connection or not.
> 
> That's broken by setting the cookie unconditionally outside
> the transport_lock, isn't it?
> 
> 
> > 2) It ensures that when several requests with the same connection
> > cookie all call xprt_conditional_disconnect(), then that results in
> > a
> > single disconnection event. To do so, it assumes that
> > xprt_autoclose()
> > will change the cookie if the disconnection attempt is successful.
> > 
> > In TCP we do so in the xs_tcp_state_change(). If the RDMA transport
> > can
> > guarantee that the call to xprt->ops->close(xprt) is always
> > successful,
> > then you could do so there.
> 
> I don't mind moving the cookie bump to rpcrdma_conn_upcall,
> but I'm not sure I understand the locking requirements.
> 
> Currently, xprt_transmit sets the connect_cookie while holding
> the transport_lock.
> 
> xprt_conditional_disconnect compares the cookie while holding
> the transport_lock.
> 
> For TCP, the transport_lock is held when bumping the cookie
> in the ESTABLISHED case, but _not_ in the two CLOSE cases?

That should be OK. The networking layer should provide sufficient
serialisation that we don't have to worry about collisions.

> 
> xprtrdma holds the transport_lock when bumping the cookie,
> which it does in its connect worker. It has to hold the lock
> because it skips the value 0. xprtrdma needs to guarantee
> that an RPC is never transmitted on the same connection
> twice (and maybe it could use rq_connect_cookie instead of
> its own cookie).
> 
> xprt_reserve_init is holding the reserve_lock but not the
> transport_lock when it grabs the cookie. Maybe it should
> not be initializing the rqst's cookie there?
> 
> Seems to me that xprt_transmit needs to update the rqst's
> cookie while holding the transport_lock, especially if
> xprtrdma needs to skip a cookie value? I'm sure I'm missing
> something.
> 

It should be OK, given that the connection is a state machine.
However, I missed something that you said earlier about
xprt_prepare_transmit().

OK. How about the following fixup patch instead of the earlier one?

8<---------------------------------------------------
>From 21cdb2802d9d8b71553998e6be5aafeff0742142 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
Date: Thu, 14 Dec 2017 07:05:27 -0500
Subject: [PATCH] fixup! SUNRPC: Fix a race in the receive code path

---
 net/sunrpc/xprt.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 5e4278e9ce37..33b74fd84051 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -1001,6 +1001,7 @@ void xprt_transmit(struct rpc_task *task)
 {
 	struct rpc_rqst	*req = task->tk_rqstp;
 	struct rpc_xprt	*xprt = req->rq_xprt;
+	unsigned int connect_cookie;
 	int status, numreqs;
 
 	dprintk("RPC: %5u xprt_transmit(%u)\n", task->tk_pid, req->rq_slen);
@@ -1024,7 +1025,7 @@ void xprt_transmit(struct rpc_task *task)
 	} else if (!req->rq_bytes_sent)
 		return;
 
-	req->rq_connect_cookie = xprt->connect_cookie;
+	connect_cookie = xprt->connect_cookie;
 	req->rq_xtime = ktime_get();
 	status = xprt->ops->send_request(task);
 	trace_xprt_transmit(xprt, req->rq_xid, status);
@@ -1050,6 +1051,7 @@ void xprt_transmit(struct rpc_task *task)
 	xprt->stat.pending_u += xprt->pending.qlen;
 	spin_unlock_bh(&xprt->transport_lock);
 
+	req->rq_connect_cookie = connect_cookie;
 	if (rpc_reply_expected(task) && !READ_ONCE(req->rq_reply_bytes_recvd)) {
 		/*
 		 * Sleep on the pending queue if we're expecting a reply.
-- 
2.14.3

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@xxxxxxxxxxxxxxx
��.n��������+%������w��{.n�����{��w���jg��������ݢj����G�������j:+v���w�m������w�������h�����٥




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux