On Tue, May 28, 2019 at 09:30:56AM +0300, Yuval Shaia wrote: > On Mon, May 27, 2019 at 03:22:20PM -0300, Jason Gunthorpe wrote: > > On Mon, May 27, 2019 at 06:00:04PM +0300, Yuval Shaia wrote: > > > The virtual address that is registered is used as a base for any address > > > used later in post_recv and post_send operations. > > > > > > On a virtualised environment this is not correct. > > > > > > A guest cannot register its memory so hypervisor maps the guest physical > > > address to a host virtual address and register it with the HW. Later on, > > > at datapath phase, the guest fills the SGEs with addresses from its > > > address space. > > > Since HW cannot access guest virtual address space an extra translation > > > is needed to map those addresses to be based on the host virtual address > > > that was registered with the HW. > > > > > > To avoid this, a logical separation between the address that is > > > registered and the address that is used as a offset at datapath phase is > > > needed. > > > > > > This separation is already implemented in the lower layer part > > > (ibv_cmd_reg_mr) but blocked at the API level. > > > > > > Fix it by introducing a new API function that accepts a address from > > > guest virtual address space as well. > > > > > > Signed-off-by: Yuval Shaia <yuval.shaia@xxxxxxxxxx> > > > libibverbs/verbs.c | 24 ++++++++++++++++++++++++ > > > libibverbs/verbs.h | 6 ++++++ > > > 2 files changed, 30 insertions(+) > > > > > > diff --git a/libibverbs/verbs.c b/libibverbs/verbs.c > > > index 1766b9f5..9ad74ee0 100644 > > > +++ b/libibverbs/verbs.c > > > @@ -324,6 +324,30 @@ LATEST_SYMVER_FUNC(ibv_reg_mr, 1_1, "IBVERBS_1.1", > > > return mr; > > > } > > > > > > +LATEST_SYMVER_FUNC(ibv_reg_mr_virt_as, 1_1, "IBVERBS_1.1", > > > + struct ibv_mr *, > > > + struct ibv_pd *pd, void *addr, size_t length, > > > + uint64_t hca_va, int access) > > > +{ > > > > Doesn't need this macro since it doesn't have a compat version > > That is weird, without this it fails in link stage. > > /usr/bin/ld: hw/rdma/rdma_backend.o: in function `rdma_backend_create_mr': > rdma_backend.c:(.text+0x260a): undefined reference to `ibv_reg_mr_virt_as' > collect2: error: ld returned 1 exit status You do have to update the map file in any case > > > > > + struct verbs_mr *vmr; > > > + struct ibv_reg_mr cmd; > > > + struct ib_uverbs_reg_mr_resp resp; > > > + int ret; > > > + > > > + vmr = malloc(sizeof(*vmr)); > > > + if (!vmr) > > > + return NULL; > > > + > > > + ret = ibv_cmd_reg_mr(pd, addr, length, hca_va, access, vmr, &cmd, > > > + sizeof(cmd), &resp, sizeof(resp)); > > > > It seems problematic not to call the driver, several of the drivers > > are wrappering mr in their own type (ie bnxt_re_mr) and we can't just > > allocate the wrong size of memory here. > > > > What you should do is modify the existing driver callback to accept > > another argument and go and fix all the drivers to pass that argument > > into their ibv_cmd_reg_mr as the hca_va above. This looks pretty > > trivial. > > So back to what was proposed in the RFC besides the addition of new arg > instead of new callback, right? Well, half and half, keep the normal function for the entry point and change the sigature of the callback in driver.h Jason