On 9/22/2015 12:56 AM, Sagi Grimberg wrote:
On 9/22/2015 10:19 AM, Sagi Grimberg wrote:
As mentioned earlier, I have a WIP RDS fastreg branch [3]
which is functional (at least I can RDMA messages across
nodes ;-)).
Nice!
So merging [2] and [3], I created [4] and applied
a delta change based on your other patches. I saw ib_post_send
failure with my HCA driver returning '-EINVAL'. I didn't
debug it further but at least opcode and num_sge were set
correctly so I shouldn't have seen it. So I did memset()
on reg_wr which seems to have helped to fix the ib_post_send()
failure.
Yep - that was my fault. When converting the ULPs I optimized by
removing the memset but I forgot to set reg_wr.wr.next = NULL when
the ULP needed. This caused the driver to read a second bogus work
request. Steve just reported this as well so I'll fix that in v2.
Ahh, right. There can be chain of wr.
But I got into remote access errors which tells me that I
have messed up setup(rkey, sge setup or access flags)
One thing that pops is that in the old API the MR was registered
with iova_start = 0 (which is probably what was sent to the peer),
but the new API the iova is implicitly sg_dma_address(&sg[0]).
The registered MR holds these attributes in:
mr->rkey
mr->iova
mr->length
These should be passed to a peer to perform rdma.
right.
ohh, I just read the RDS 3.1 specification (for the first time..) and I
noticed that RDS 3.1 header extension contains only a 32bit offset
parameter. Why is that anyway? why not 64bit so it can be a valid mapped
address? Also the code doesn't use it at all and always passes 0 (which
is buggy if sg[0] has an offset from a page).
This won't work with the proposed API as the iova is 64bit (as all other
existing RDMA protocols use 64bit addresses).
In any event, I'd much rather to add ib_map_mr_sg_zbva() just for RDS
to use instead of polluting the API with an iova argument, but I think
that the RDS spec can be updated to use 64bit offsets and align to all
other RDMA protocols (it has enough space in h_exthdr which is 128bit).
RDS assumes it's an offset and hence it has been used as 32 bit. I need
to look through this carefully though because all the existing
application use this header format. There is also RDMA read/write
byte information sent as part of the header(Not in upstream code yet)
so the space might be less. But point taken. Will look into it.
I was thinking of:
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index e7e0251..61fcab4 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -3033,6 +3033,21 @@ int ib_map_mr_sg(struct ib_mr *mr,
unsigned int sg_nents,
unsigned int page_size);
+static inline int
+ib_map_mr_sg_zbva(struct ib_mr *mr,
+ struct scatterlist *sg,
+ unsigned int sg_nents,
+ unsigned int page_size)
+{
+ int rc;
+
+ rc = ib_map_mr_sg(mr, sg, sg_nents, page_size);
+ if (likely(!rc))
+ mr->iova &= ((u64)page_size - 1);
+
+ return rc;
+}
+
int ib_sg_to_pages(struct ib_mr *mr,
struct scatterlist *sgl,
unsigned int sg_nents,
--
Thoughts?
Santosh, can you use that one instead and let us know if
it resolves your issue?
Unfortunately this change still doesn't fix the issue.
I think you should make sure to correctly construct the
h_exthdr with: rds_rdma_make_cookie(mr->rkey, (32)mr->iova)
Will look into it. Thanks for suggestion.
Regards,
Santosh
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html