Re: [PATCH 2.6.30] xprtrdma: The frmr iova_start values are truncated by the nfs rdma client.

Tom Talpey wrote:
At 03:50 PM 4/27/2009, Steve Wise wrote:
Tom Talpey wrote:
At 03:32 PM 4/27/2009, Steve Wise wrote:
Trond Myklebust wrote:
On Mon, 2009-04-27 at 14:05 -0400, Trond Myklebust wrote:
It looks as though the bug is really that the IB code is using a
u64 to store dma handles. As an external user of the IB api, we really
shouldn't have to perform this sort of transformation. If it is
absolutely necessary, then it should be done by means of specialised
accessor functions to initialise/read iova_start value when given a
dma_addr_t.

I'd therefore prefer the no-cast version (with eventual compiler
warnings), in the hope that eventually the IB folks will fix their
interface.
Translation: It looks to me as if the interface that we're using is a
bit too corrupted with IB low level implementation grime. In the future,
I'd like to see someone come up with a more high level interface for use
by external code such as the sunrpc module.

Clarification: The iova_start isn't used to store dma handles. The
Agreed, it's more of a hardware register, that ends up on the wire as well.

I think the net of this is that the mr_dma should have a more sensible
up-cast that yields the right bits in the iova_start. Maybe a nice
machine-dependent macro, defined in the RDMA layer, would be a good
approach. Surely the other upper layers need it too.
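
One way such a helper might look, as a user-space sketch only. `iova_from_dma` is a hypothetical name, not an existing RDMA-layer function, and `dma_addr_t` is modelled here as 64 bits (an assumption, as on a 32-bit kernel configured for 64-bit DMA addresses). The second function models the (unsigned long) cast on a 32-bit system to show the truncation named in the subject line.

```c
#include <stdint.h>

/* Assumption: DMA addresses are 64 bits wide on this configuration. */
typedef uint64_t dma_addr_t;
/* Assumption: stands in for i386's 32-bit unsigned long. */
typedef uint32_t ulong32_t;

/* Hypothetical accessor: widen the DMA address without losing bits. */
static inline uint64_t iova_from_dma(dma_addr_t dma)
{
    return (uint64_t)dma;
}

/* Models the (unsigned long) cast on i386: the upper 32 bits are dropped. */
static inline uint64_t iova_via_ulong32(dma_addr_t dma)
{
    return (uint64_t)(ulong32_t)dma;
}
```

With a DMA address of 0x123456000, the accessor keeps the full value, while the 32-bit cast leaves 0x23456000.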

While I have the floor, why doesn't the server have this issue? Looking
at the code, it has the same (unsigned long) cast as the client when
initializing its iova_start.

The server isn't using the dma address as the iova_start, but rather a kernel virtual address pointer, which is 32b on an i386 system. If you take the cast off, then the sign bit gets extended into the u64. Apparently pointers are signed?
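
A minimal user-space sketch of the effect described above, assuming a GCC-style compiler that sign-extends when a pointer is converted to a wider integer. A 64-bit build cannot reproduce a 32-bit pointer directly, so the kernel virtual address is modelled as a 32-bit pattern. The type and helper names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit kernel virtual address (i386). */
typedef uint32_t kva32_t;

/*
 * Models storing the pointer in a u64 with no intermediate cast on a
 * compiler that sign-extends narrower pointers: bit 31 is copied into
 * bits 32..63. The uint32_t -> int32_t conversion is implementation-defined
 * in C; GCC and Clang wrap modulo 2^32.
 */
static uint64_t widen_sign_extended(kva32_t kva)
{
    return (uint64_t)(int64_t)(int32_t)kva;
}

/*
 * Models the (unsigned long) cast on i386, where unsigned long is 32 bits:
 * the value is zero-extended when stored in the u64.
 */
static uint64_t widen_zero_extended(kva32_t kva)
{
    return (uint64_t)kva;
}
```

For an address with bit 31 set, such as 0xc0001000, the first helper yields 0xffffffffc0001000 and the second yields 0x00000000c0001000. Addresses below 0x80000000 come out the same either way.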

Why is the server using a u64 to store a naked pointer? That has to be
a bug. Casting to (unsigned long) is just hiding it.


That is what it wants to use as the registration for its frmr, which in this case is used as the source of an RDMA Write.


Does this address get handed to the RNIC to perform some sort of local
DMA?

No.
If so, how does it work if there's an IOMMU in the system? The
kva isn't necessarily the same as the dma_addr, right?


Correct. This kva is used as the iova_start for the fast-registered memory region. All it is used for is to mark the base value for "addresses" passed in via sge entries in the work requests, and also for incoming "addresses" in rdma packets. So you can use the kva when you fastreg the mr, and then also use the kva + any offset in the sge entries of your work requests that utilize it. Additionally, you can advertise the fastreg rkey, iova_start, and length to the peer for doing rdma into that region. The HW will validate any SGE entry and any incoming rdma packet to ensure that the rkey/addr/len in the sge/packet is within the bounds of the fastreg mr, namely that the sge/packet address and length fall between iova_start and iova_start+fastreg_len.
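
The bounds validation described above can be sketched roughly as follows. This is an illustration in plain C, not the adapter's actual logic: the struct and function names are hypothetical, and the rkey comparison is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of a fast-registered memory region. */
struct fastreg_mr {
    uint64_t iova_start;  /* base "address" chosen at registration time */
    uint64_t length;      /* fastreg_len */
};

/*
 * Returns true when [addr, addr + len) lies inside
 * [iova_start, iova_start + length). The check uses subtraction only,
 * so it never computes addr + len, which could overflow.
 */
static bool mr_range_ok(const struct fastreg_mr *mr, uint64_t addr, uint64_t len)
{
    if (addr < mr->iova_start)
        return false;

    uint64_t offset = addr - mr->iova_start;
    if (offset > mr->length)
        return false;

    return len <= mr->length - offset;
}
```

In this sketch, a region registered with iova_start 0xc0001000 and length 0x2000 accepts an access at 0xc0002000 of 0x1000 bytes, and rejects a sign-extended address such as 0xffffffffc0001000.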


BTW, pointers are unsigned, but the assignment to u64 causes the
compiler to convert the pointer into a ptrdiff_t, in effect evaluating
((pointer) - NULL). Then, since the ptrdiff_t is a signed 32 bits, the
promotion results in the sign extension. I think! IOW, bug.


I see.  And that's why the cast is needed for the server side.


Steve.
