Re: [PATCH] Fix regression in NFSRDMA server

On Fri, Mar 28, 2014 at 07:11:56PM -0500, Tom Tucker wrote:
> Hi Bruce,
> 
> On 3/28/14 4:26 PM, J. Bruce Fields wrote:
> >On Fri, Mar 28, 2014 at 10:21:27AM -0500, Tom Tucker wrote:
> >>Hi Bruce,
> >>
> >>On 3/27/14 9:08 PM, J. Bruce Fields wrote:
> >>>On Tue, Mar 25, 2014 at 03:14:57PM -0500, Steve Wise wrote:
> >>>>From: Tom Tucker <tom@xxxxxx>
> >>>>
> >>>>The server regression was caused by the addition of rq_next_page
> >>>>(afc59400d6c65bad66d4ad0b2daf879cbff8e23e). There were a few places that
> >>>>were missed with the update of the rq_respages array.
> >>>Apologies.  (But, it could happen again--could we set up some regular
> >>>testing?  It doesn't have to be anything fancy, just cthon over
> >>>rdma--really, just read and write over rdma--would probably catch a
> >>>lot.)
> >>I think Chelsio is going to be adding some NFSRDMA regression
> >>testing to their system test.
> >OK.  Do you know who there is setting that up?  I'd be curious exactly
> >what kernels they intend to test and how they plan to report results.
> >
> 
> I don't know, Steve can weigh in on this...
> 
> >>>Also: I don't get why all these rq_next_page initializations are
> >>>required.  Why isn't the initialization at the top of svc_process()
> >>>enough?  Is rdma using it before we get to that point?  The only use of
> >>>it I see off hand is in the while loop that you're deleting.
> >>I didn't apply tremendous deductive powers here, I just added
> >>updates to rq_next_page wherever the transport messed with
> >>rq_respages. That said, NFS WRITE is likely the culprit since the
> >>write is completed as a deferral and therefore the request doesn't
> >>go through svc_process, so if rq_next_page is bogus, the cleanup
> >>will free/re-use pages that are actually in use by the transport.
> >Ugh, OK, without tracing through the code I guess I can see how that
> >would happen.  Remind me why it's using deferrals?
> 
> The server fetches the write data from the client using RDMA READ.
> So the request says ... "here's where the data is in my memory", and
> then the server issues an RDMA READ to fetch it. When the read
> completes, the deferred request is completed.

That makes sense, but I'm not sure what you mean by deferring.

The tcp code can also receive a request over multiple recvfroms.  See
Trond's hack in 31d68ef65c7d4 "SUNRPC: Don't wait for full record to
receive tcp data".

--b.