Re: [PATCH] svcrdma: fix offset calculation for non-page aligned sge entries

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 12 Mar 2014 17:18:32 -0400
Jeff Layton <jlayton@xxxxxxxxxx> wrote:

> On Wed, 12 Mar 2014 15:43:10 -0500
> Tom Tucker <tom@xxxxxxxxxxxxxxxxxxxxx> wrote:
> 
> > Hi Jeff,
> > 
> > 
> > On 3/12/14 9:51 AM, Jeff Layton wrote:
> > > The xdr_off value in dma_map_xdr gets passed to ib_dma_map_page as the
> > > offset into the page to be mapped. For the case of the pages array, the
> > > existing calculation always seems to end up at 0, which causes data
> > > corruption when a non-page-aligned READ request comes in. The server ends
> > > up doing the RDMA_WRITE from the wrong part of the page.
> > >
> > > Override the xdr_off value with the page_base in that situation to fix
> > > the issue. Obviously, this method can't contend with a page_base that is
> > > larger than PAGE_SIZE.
> > I think we saw page base > PAGE_SIZE when inter-operating with Sun, but 
> > maybe someone else on the list has a clearer recollection.
> > 
> 
> Ok, I may need to rejigger that logic to account for that case. I'll do
> that and send a v2 once I test it out...
> 
> > I am curious, however, why xdr_off was always zero if in fact, page_base 
> > was not page aligned.
> > 
> 
> Mostly I was testing with sub-page sized READ requests (which are of
> course done via RDMA_WRITE). Assume we have a READ request for 512
> bytes.
> 
> On the first pass into send_write, we call dma_map_xdr to map the
> head. On that call xdr_off is 0.
> 
> On the next pass, we go to send the page data. At that point,
> xdr_off == head[0].iov_len:
> 
> -------------------8<----------------------
>         } else {
>                 xdr_off -= xdr->head[0].iov_len;
>                 if (xdr_off < xdr->page_len) {
>                         /* This offset is in the page list */
>                         page = xdr->pages[xdr_off >> PAGE_SHIFT];
>                         xdr_off &= ~PAGE_MASK;
>                 } else {
> -------------------8<----------------------
> 
> ...so we subtract xdr->head[0].iov_len, and xdr_off is now 0.
> 
> The problem is that you can't determine where in the pages array the
> data to send actually starts just from the xdr_off. That just tells you
> how far into the xdr buffer you are. If the data you actually want
> isn't aligned to the start of the page (xdr->page_base !=0) then you
> can't tell from that.
> 
> That info is tracked inside the xdr->page_base, so we need use that to
> determine what the mapping offset should be.
> 

Ok, so the above is only part of the story. It's true that I never get
an xdr_off > 0 after subtracting the head's iov_len, but only if the
requested read call doesn't span more than one page, or more than one
wc_array entry.

I went over the existing code again over the last few days and came to
the conclusion that it's basically correct, and the only thing missing
is making it add in the xdr->page_base. The patch I just sent adds that
in, and from what I can tell it does the right thing on every READ call
that I tested.

I'm still seeing panics in the write codepath, which I suspect is the
same problem that Steve Wise was looking at, but I haven't yet had a
chance to look at that closely.

-- 
Jeff Layton <jlayton@xxxxxxxxxx>
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux