RE: [PATCH] Fix regression in NFSRDMA server

+Indranil

Indranil Choudhury is the QA contact.  

Steve
> -----Original Message-----
> From: J. Bruce Fields [mailto:bfields@xxxxxxxxxxxx]
> Sent: Friday, March 28, 2014 4:27 PM
> To: Tom Tucker
> Cc: Steve Wise; trond.myklebust@xxxxxxxxxxxxxxx; linux-nfs@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH] Fix regression in NFSRDMA server
> 
> On Fri, Mar 28, 2014 at 10:21:27AM -0500, Tom Tucker wrote:
> > Hi Bruce,
> >
> > On 3/27/14 9:08 PM, J. Bruce Fields wrote:
> > >On Tue, Mar 25, 2014 at 03:14:57PM -0500, Steve Wise wrote:
> > >>From: Tom Tucker <tom@xxxxxx>
> > >>
> > >>The server regression was caused by the addition of rq_next_page
> > >>(afc59400d6c65bad66d4ad0b2daf879cbff8e23e). A few places where the
> > >>transport updates the rq_respages array were missed.
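For context, a rough sketch of how these fields relate. The names below
are the real struct svc_rqst fields, but the diagram and the final line
(roughly what svc_process() sets up at the start of a normal request)
are illustrative rather than lines taken from the patch:

    /*
     * rq_pages[]:  [ arg pages .............. | reply pages ......... ]
     *                ^                          ^              ^
     *                rq_pages                   rq_respages    rq_next_page
     *
     * rq_respages marks the first reply page, and rq_next_page is meant
     * to point one past the last page currently in use; the generic
     * sunrpc code walks that range when it releases reply pages.
     */

    /* Roughly the per-request initialization done in svc_process(): */
    rqstp->rq_next_page = rqstp->rq_respages + 1;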
> > >Apologies.  (But, it could happen again--could we set up some regular
> > >testing?  It doesn't have to be anything fancy, just cthon over
> > >rdma--really, just read and write over rdma--would probably catch a
> > >lot.)
> >
> > I think Chelsio is going to be adding some NFSRDMA regression
> > testing to their system test.
> 
> OK.  Do you know who there is setting that up?  I'd be curious exactly
> what kernels they intend to test and how they plan to report results.
> 
> > >Also: I don't get why all these rq_next_page initializations are
> > >required.  Why isn't the initialization at the top of svc_process()
> > >enough?  Is rdma using it before we get to that point?  The only use of
> > >it I see off hand is in the while loop that you're deleting.
> >
> > I didn't apply tremendous deductive powers here; I just added
> > updates to rq_next_page wherever the transport messed with
> > rq_respages. That said, NFS WRITE is likely the culprit: the write
> > is completed as a deferral, so the request never goes through
> > svc_process, and if rq_next_page is bogus the cleanup will
> > free/re-use pages that are actually in use by the transport.
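To make the failure mode concrete: a deferred WRITE never comes back
through svc_process(), so rq_next_page keeps whatever value the previous
request left behind, and the generic release path then walks it. A
paraphrase of that cleanup (along the lines of what svc_xprt_release()
does via svc_free_res_pages() in this era; not a verbatim copy):

    /* Walk back from rq_next_page to rq_respages, dropping a reference
     * on (and so potentially freeing) every reply page in that range.
     */
    while (rqstp->rq_next_page != rqstp->rq_respages) {
            struct page **pp = --rqstp->rq_next_page;

            if (*pp) {
                    put_page(*pp);
                    *pp = NULL;
            }
    }

With a stale rq_next_page that range spans pages the RDMA transport
still has posted for I/O, which is exactly the free/re-use of in-flight
pages described above.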
> 
> Ugh, OK, without tracing through the code I guess I can see how that
> would happen.  Remind me why it's using deferrals?
> 
> Applying the patch.
> 
> --b.
> 
> >
> > Tom
> > >--b.
> > >
> > >>Signed-off-by: Tom Tucker <tom@xxxxxx>
> > >>Tested-by: Steve Wise <swise@xxxxxx>
> > >>---
> > >>
> > >>  net/sunrpc/xprtrdma/svc_rdma_recvfrom.c |   12 ++++--------
> > >>  net/sunrpc/xprtrdma/svc_rdma_sendto.c   |    1 +
> > >>  2 files changed, 5 insertions(+), 8 deletions(-)
> > >>
> > >>diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> > >>index 0ce7552..8d904e4 100644
> > >>--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> > >>+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> > >>@@ -90,6 +90,7 @@ static void rdma_build_arg_xdr(struct svc_rqst *rqstp,
> > >>  		sge_no++;
> > >>  	}
> > >>  	rqstp->rq_respages = &rqstp->rq_pages[sge_no];
> > >>+	rqstp->rq_next_page = rqstp->rq_respages + 1;
> > >>  	/* We should never run out of SGE because the limit is defined to
> > >>  	 * support the max allowed RPC data length
> > >>@@ -169,6 +170,7 @@ static int map_read_chunks(struct svcxprt_rdma *xprt,
> > >>  		 */
> > >>  		head->arg.pages[page_no] = rqstp->rq_arg.pages[page_no];
> > >>  		rqstp->rq_respages = &rqstp->rq_arg.pages[page_no+1];
> > >>+		rqstp->rq_next_page = rqstp->rq_respages + 1;
> > >>  		byte_count -= sge_bytes;
> > >>  		ch_bytes -= sge_bytes;
> > >>@@ -276,6 +278,7 @@ static int fast_reg_read_chunks(struct svcxprt_rdma *xprt,
> > >>  	/* rq_respages points one past arg pages */
> > >>  	rqstp->rq_respages = &rqstp->rq_arg.pages[page_no];
> > >>+	rqstp->rq_next_page = rqstp->rq_respages + 1;
> > >>  	/* Create the reply and chunk maps */
> > >>  	offset = 0;
> > >>@@ -520,13 +523,6 @@ next_sge:
> > >>  	for (ch_no = 0; &rqstp->rq_pages[ch_no] < rqstp->rq_respages; ch_no++)
> > >>  		rqstp->rq_pages[ch_no] = NULL;
> > >>-	/*
> > >>-	 * Detach res pages. If svc_release sees any it will attempt to
> > >>-	 * put them.
> > >>-	 */
> > >>-	while (rqstp->rq_next_page != rqstp->rq_respages)
> > >>-		*(--rqstp->rq_next_page) = NULL;
> > >>-
> > >>  	return err;
> > >>  }
> > >>@@ -550,7 +546,7 @@ static int rdma_read_complete(struct svc_rqst *rqstp,
> > >>  	/* rq_respages starts after the last arg page */
> > >>  	rqstp->rq_respages = &rqstp->rq_arg.pages[page_no];
> > >>-	rqstp->rq_next_page = &rqstp->rq_arg.pages[page_no];
> > >>+	rqstp->rq_next_page = rqstp->rq_respages + 1;
> > >>  	/* Rebuild rq_arg head and tail. */
> > >>  	rqstp->rq_arg.head[0] = head->arg.head[0];
> > >>diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> > >>index c1d124d..11e90f8 100644
> > >>--- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> > >>+++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> > >>@@ -625,6 +625,7 @@ static int send_reply(struct svcxprt_rdma *rdma,
> > >>  		if (page_no+1 >= sge_no)
> > >>  			ctxt->sge[page_no+1].length = 0;
> > >>  	}
> > >>+	rqstp->rq_next_page = rqstp->rq_respages + 1;
> > >>  	BUG_ON(sge_no > rdma->sc_max_sge);
> > >>  	memset(&send_wr, 0, sizeof send_wr);
> > >>  	ctxt->wr_op = IB_WR_SEND;
> > >>
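Taken together, the hunks apply one rule: every place the RDMA receive
or send path moves rq_respages now re-points rq_next_page one past it,
matching the state svc_process() would otherwise have established, so
the release-time walk sketched above covers the range the generic code
expects rather than whatever a previous request left behind.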
> >

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html