> On Dec 3, 2017, at 3:12 PM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote: > > On Sun, 2017-12-03 at 13:54 -0500, Chuck Lever wrote: >>> On Dec 3, 2017, at 1:50 PM, Trond Myklebust <trond.myklebust@primar >>> ydata.com> wrote: >>> >>> We must ensure that the call to rpc_sleep_on() in xprt_transmit() >>> cannot >>> race with the call to xprt_complete_rqst(). >> >> :-( this will kill scalability, we might as well just go back >> to the old locking scheme. > > It shouldn't make a huge difference, but I agree that we do want to get > rid of that transport lock. > > How about the following, then? This looks better, I'll give it a try! But touching the recv_lock in the transmit path is what I'd like to avoid completely, if possible. I'm not clear on why the pending waitqueue is unprotected. Doesn't it have a lock of its own? > 8<------------------------------------------------------------------ > From 6a0c507f160d5624d9049281cd9dfe222a866f06 Mon Sep 17 00:00:00 2001 > From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> > Date: Sun, 3 Dec 2017 13:37:27 -0500 > Subject: [PATCH v2] SUNRPC: Fix a race in the receive code path > > We must ensure that the call to rpc_sleep_on() in xprt_transmit() cannot > race with the call to xprt_complete_rqst(). > > Reported-by: Chuck Lever <chuck.lever@xxxxxxxxxx> > Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=317 > Fixes: ce7c252a8c74 ("SUNRPC: Add a separate spinlock to protect..") > Cc: stable@xxxxxxxxxxxxxxx # 4.14+ > Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> > --- > net/sunrpc/xprt.c | 18 +++++++++++------- > 1 file changed, 11 insertions(+), 7 deletions(-) > > diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c > index 333b9d697ae5..34f613385319 100644 > --- a/net/sunrpc/xprt.c > +++ b/net/sunrpc/xprt.c > @@ -1024,6 +1024,7 @@ void xprt_transmit(struct rpc_task *task) > } else if (!req->rq_bytes_sent) > return; > > + req->rq_connect_cookie = xprt->connect_cookie; > req->rq_xtime = ktime_get(); > status = xprt->ops->send_request(task); > trace_xprt_transmit(xprt, req->rq_xid, status); > @@ -1048,19 +1049,22 @@ void xprt_transmit(struct rpc_task *task) > xprt->stat.sending_u += xprt->sending.qlen; > xprt->stat.pending_u += xprt->pending.qlen; > > - /* Don't race with disconnect */ > - if (!xprt_connected(xprt)) > - task->tk_status = -ENOTCONN; > - else { > + spin_unlock_bh(&xprt->transport_lock); > + > + if (!READ_ONCE(req->rq_reply_bytes_recvd) && rpc_reply_expected(task)) { > + spin_lock(&xprt->recv_lock); > /* > * Sleep on the pending queue since > * we're expecting a reply. > */ > - if (!req->rq_reply_bytes_recvd && rpc_reply_expected(task)) > + if (!req->rq_reply_bytes_recvd) { > rpc_sleep_on(&xprt->pending, task, xprt_timer); > - req->rq_connect_cookie = xprt->connect_cookie; > + /* Deal with disconnect races */ > + if (!xprt_connected(xprt)) > + xprt_wake_pending_tasks(xprt, -ENOTCONN); > + } > + spin_unlock(&xprt->recv_lock); > } > - spin_unlock_bh(&xprt->transport_lock); > } > > static void xprt_add_backlog(struct rpc_xprt *xprt, struct rpc_task *task) > -- > 2.14.3 > > -- > Trond Myklebust > Linux NFS client maintainer, PrimaryData > trond.myklebust@xxxxxxxxxxxxxxx > N§ēæėrļyúčØbēXŽķĮ§vØ^)Þš{.nĮ+·Ĩ{ąû"Ø^nrĄöĶzËëhĻčÚ&ĒøŪGŦéhŪ(éÝĒj"úķm§ĸïęäzđÞāþfĢĒ·h§~m -- Chuck Lever chucklever@xxxxxxxxx -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html