On Wed, Nov 04, 2020 at 11:52:55AM -0400, Jason Gunthorpe wrote:
> It could work, I think a reasonable ULP API would be to have some
>
>  rdma_fill_ib_sge_from_sgl()
>  rdma_map_sge_single()
>  etc etc
>
> ie instead of wrappering the DMA API as-is we have a new API that
> directly builds the ib_sge. It always fills the local_dma_lkey from
> the pd, so it knows it is doing DMA from local kernel memory.

Yeah.

> Logically SW devices then have a local_dma_lkey MR that has an IOVA of
> the CPU physical address space, not the DMA address space as HW
> devices have. The ib_sge builders can know this detail and fill in
> addr from either a cpu physical or a dma map.

I don't think the builders are the right place to do it - it really
should be in the low-level drivers for a bunch of reasons:

 1) this avoids doing the dma_map when no DMA is performed, e.g. for
    mlx5 when send data is in the extended WQE
 2) to deal with the fact that dma mapping reduces the number of SGEs.
    When the system uses a modern IOMMU we'll always end up with a
    single IOVA range no matter how many pages were mapped originally.
    This means any MR process can actually be consolidated to use a
    single SGE with the local lkey.

Note that 2 implies a somewhat more complicated API, where the ULP
attempts to create a MR, but the core/driver will tell it that it
didn't need a MR at all.
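
To make point 2 a bit more concrete, here is a rough user-space sketch of
the "try to map, and be told no MR is needed" flow. It is only a model
under stated assumptions, not a proposed kernel interface: every sketch_*
name, the SKETCH_LOCAL_DMA_LKEY value and the result enum are hypothetical,
and the segment array stands in for a scatterlist that has already been
DMA mapped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA-mapped scatterlist entry. */
struct sketch_seg {
	uint64_t dma_addr;	/* address after the (simulated) dma mapping */
	uint32_t len;
};

/* Hypothetical stand-in for an SGE; not the kernel's struct ib_sge. */
struct sketch_sge {
	uint64_t addr;
	uint32_t length;
	uint32_t lkey;
};

/* Made-up value; a real lkey would come from the PD. */
#define SKETCH_LOCAL_DMA_LKEY 0x1234u

enum sketch_mr_result {
	SKETCH_NO_MR_NEEDED,	/* one contiguous range, local lkey suffices */
	SKETCH_MR_REQUIRED,	/* several ranges, caller must register an MR */
};

/* Merge address-adjacent segments in place, return the new count. */
static size_t sketch_coalesce(struct sketch_seg *segs, size_t n)
{
	size_t out = 0;

	if (n == 0)
		return 0;
	for (size_t i = 1; i < n; i++) {
		if (segs[out].dma_addr + segs[out].len == segs[i].dma_addr)
			segs[out].len += segs[i].len;
		else
			segs[++out] = segs[i];
	}
	return out + 1;
}

/*
 * Fill a single SGE using the local lkey when the mapping collapsed to
 * one range; otherwise report that an MR is still needed.
 */
static enum sketch_mr_result
sketch_try_map(struct sketch_seg *segs, size_t n, struct sketch_sge *sge)
{
	assert(sge != NULL);

	if (sketch_coalesce(segs, n) != 1)
		return SKETCH_MR_REQUIRED;

	sge->addr = segs[0].dma_addr;
	sge->length = segs[0].len;
	sge->lkey = SKETCH_LOCAL_DMA_LKEY;
	return SKETCH_NO_MR_NEEDED;
}
```

A caller would pass the mapped segments to sketch_try_map() and only fall
back to its MR registration path on SKETCH_MR_REQUIRED. In this model the
decision sits below the ULP, which is the reason for wanting it in the
core/driver rather than in the ib_sge builders.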