On Thu, May 30, 2019 at 03:57:37PM -0300, Jason Gunthorpe wrote:
> On Thu, May 30, 2019 at 09:56:09PM +0300, Yuval Shaia wrote:
> > On Thu, May 30, 2019 at 06:17:21PM +0000, Jason Gunthorpe wrote:
> > > On Thu, May 30, 2019 at 05:34:53PM +0300, Yuval Shaia wrote:
> > > > On Thu, May 30, 2019 at 12:37:18PM +0000, Michal Kalderon wrote:
> > > > > > From: linux-rdma-owner@xxxxxxxxxxxxxxx <linux-rdma-
> > > > > > owner@xxxxxxxxxxxxxxx> On Behalf Of Yuval Shaia
> > > > > >
> > > > > > The virtual address that is registered is used as a base for any address passed
> > > > > > later in post_recv and post_send operations.
> > > > > >
> > > > > > On a virtualized environment this is not correct.
> > > > > >
> > > > > > A guest cannot register its memory so hypervisor maps the guest physical
> > > > > > address to a host virtual address and register it with the HW. Later on, at
> > > > > > datapath phase, the guest fills the SGEs with addresses from its address
> > > > > > space.
> > > > > > Since HW cannot access guest virtual address space an extra translation is
> > > > > > needed to map those addresses to be based on the host virtual address that
> > > > > > was registered with the HW.
> > > > > > This datapath interference affects performances.
> > > > > >
> > > > > > To avoid this, a logical separation between the address that is registered and
> > > > > > the address that is used as a offset at datapath phase is needed.
> > > > > > This separation is already implemented in the lower layer part
> > > > > > (ibv_cmd_reg_mr) but blocked at the API level.
> > > > > >
> > > > > > Fix it by introducing a new API function which accepts an address from guest
> > > > > > virtual address space as well, to be used as offset for later datapath
> > > > > > operations.
> > > > > >
> > > > > Could you give an example of how an app would use this new API? How will
> > > > > It receive the new hca_va addresss ?
> > > >
> > > > In my use case an application is device emulation that runs in the context
> > > > of a userspace process in the host.
> > > > This (virtual) device receives from guest driver a dma address (in form of
> > > > scatter-gather list) along with guest user-space virtual address.
> > >
> > > How do you handle the scatter-gather list?
> >
> > Well, it is not exactly scatter-gather, lets think of it as an array of
> > guest dma addresses, in mellanox terms it is mtt, in vmware it is page
> > directory.
> > So i guess your question is how to register list of scattered addresses,
> > right?
>
> Yes..
>
> I've always thought we should have an API to do that.

Yeah, we had this discussion at RDMA Plumbers two years ago.
Then came an RFC from Mellanox suggesting a generic way to register
non-contiguous memory, but I didn't see any progress on that.

Anyway, what I did is create an "alias" to the first address using the
mremap system call. This new address points to a buffer whose size is the
sum of the sizes of all buffers in the array. Then I loop over the array
and, with a different use of mremap, map each entry to the corresponding
offset in the big contiguous range.
This way I have a contiguous virtual address that can be registered as
an MR.

Hope I was able to explain the flow. If needed I can share the code.

>
> Jason
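
For reference, the mremap flow described above can be sketched roughly as
follows. This is an illustration under stated assumptions, not the actual
emulation code: the names alias_chunks and demo_alias are hypothetical, all
chunks are assumed to have the same page-aligned length, and every chunk is
assumed to live in a MAP_SHARED mapping (mremap with old_size == 0 only
duplicates shareable mappings). Registering the resulting range as an MR is
left out.

```c
#define _GNU_SOURCE             /* mremap() and the MREMAP_* flags */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Build one contiguous virtual range that aliases n scattered chunks,
 * each chunk_len bytes long (assumed to be a multiple of the page size).
 * Returns the base of the new range, or MAP_FAILED on error.
 */
static void *alias_chunks(void *const chunks[], size_t n, size_t chunk_len)
{
    if (n == 0 || chunk_len == 0)
        return MAP_FAILED;

    size_t total = n * chunk_len;

    /* Step 1: old_size == 0 duplicates the mapping instead of moving it.
     * The new range is total bytes long and starts out aliasing chunk 0. */
    char *base = mremap(chunks[0], 0, total, MREMAP_MAYMOVE);
    if (base == MAP_FAILED)
        return MAP_FAILED;

    /* Step 2: overlay every remaining chunk at its offset in the range.
     * MREMAP_FIXED replaces whatever was mapped at the target address. */
    for (size_t i = 1; i < n; i++) {
        void *p = mremap(chunks[i], 0, chunk_len,
                         MREMAP_MAYMOVE | MREMAP_FIXED,
                         base + i * chunk_len);
        if (p == MAP_FAILED) {
            munmap(base, total);
            return MAP_FAILED;
        }
    }
    return base;
}

/*
 * Demo: alias three non-adjacent pages of a shared pool and check that
 * writes through the contiguous alias land in the original pages.
 * Returns 0 on success, -1 on failure.
 */
static int demo_alias(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    char *pool = mmap(NULL, 8 * page, PROT_READ | PROT_WRITE,
                      MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (pool == MAP_FAILED)
        return -1;

    /* Stand-ins for the scattered guest pages: pool pages 5, 2 and 7. */
    void *chunks[3] = { pool + 5 * page, pool + 2 * page, pool + 7 * page };
    char *flat = alias_chunks(chunks, 3, page);
    if (flat == MAP_FAILED) {
        munmap(pool, 8 * page);
        return -1;
    }

    memset(flat, 'A', page);
    memset(flat + page, 'B', page);
    memset(flat + 2 * page, 'C', page);

    int ok = pool[5 * page] == 'A' &&
             pool[2 * page] == 'B' &&
             pool[7 * page + page - 1] == 'C' &&
             pool[6 * page] == 0;      /* page 6 was never part of the alias */

    munmap(flat, 3 * page);
    munmap(pool, 8 * page);
    return ok ? 0 : -1;
}
```

The first call sizes the whole range up front so that the later MREMAP_FIXED
calls have a reserved, contiguous window to land in. The pointer returned by
alias_chunks is what would then be passed to the MR registration call.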