On Wed, Mar 12, 2025 at 10:28:32AM +0100, Marek Szyprowski wrote:
> Hi Robin
>
> On 28.02.2025 20:54, Robin Murphy wrote:
> > On 20/02/2025 12:48 pm, Leon Romanovsky wrote:
> >> On Wed, Feb 05, 2025 at 04:40:20PM +0200, Leon Romanovsky wrote:
> >>> From: Leon Romanovsky <leonro@xxxxxxxxxx>
> >>>
> >>> Changelog:
> >>> v7:
> >>>  * Rebased to v6.14-rc1
> >>
> >> <...>
> >>
> >>> Christoph Hellwig (6):
> >>>   PCI/P2PDMA: Refactor the p2pdma mapping helpers
> >>>   dma-mapping: move the PCI P2PDMA mapping helpers to pci-p2pdma.h
> >>>   iommu: generalize the batched sync after map interface
> >>>   iommu/dma: Factor out a iommu_dma_map_swiotlb helper
> >>>   dma-mapping: add a dma_need_unmap helper
> >>>   docs: core-api: document the IOVA-based API
> >>>
> >>> Leon Romanovsky (11):
> >>>   iommu: add kernel-doc for iommu_unmap and iommu_unmap_fast
> >>>   dma-mapping: Provide an interface to allow allocate IOVA
> >>>   dma-mapping: Implement link/unlink ranges API
> >>>   mm/hmm: let users to tag specific PFN with DMA mapped bit
> >>>   mm/hmm: provide generic DMA managing logic
> >>>   RDMA/umem: Store ODP access mask information in PFN
> >>>   RDMA/core: Convert UMEM ODP DMA mapping to caching IOVA and page linkage
> >>>   RDMA/umem: Separate implicit ODP initialization from explicit ODP
> >>>   vfio/mlx5: Explicitly use number of pages instead of allocated length
> >>>   vfio/mlx5: Rewrite create mkey flow to allow better code reuse
> >>>   vfio/mlx5: Enable the DMA link API
> >>>
> >>>  Documentation/core-api/dma-api.rst   |  70 ++++
> >>>  drivers/infiniband/core/umem_odp.c   | 250 +++++---------
> >>>  drivers/infiniband/hw/mlx5/mlx5_ib.h |  12 +-
> >>>  drivers/infiniband/hw/mlx5/odp.c     |  65 ++--
> >>>  drivers/infiniband/hw/mlx5/umr.c     |  12 +-
> >>>  drivers/iommu/dma-iommu.c            | 468 +++++++++++++++++++++++---
> >>>  drivers/iommu/iommu.c                |  84 ++---
> >>>  drivers/pci/p2pdma.c                 |  38 +--
> >>>  drivers/vfio/pci/mlx5/cmd.c          | 375 +++++++++++----------
> >>>  drivers/vfio/pci/mlx5/cmd.h          |  35 +-
> >>>  drivers/vfio/pci/mlx5/main.c         |  87 +++--
> >>>  include/linux/dma-map-ops.h          |  54 ----
> >>>  include/linux/dma-mapping.h          |  85 +++++
> >>>  include/linux/hmm-dma.h              |  33 ++
> >>>  include/linux/hmm.h                  |  21 ++
> >>>  include/linux/iommu.h                |   4 +
> >>>  include/linux/pci-p2pdma.h           |  84 +++++
> >>>  include/rdma/ib_umem_odp.h           |  25 +-
> >>>  kernel/dma/direct.c                  |  44 +--
> >>>  kernel/dma/mapping.c                 |  18 ++
> >>>  mm/hmm.c                             | 264 +++++++++++++--
> >>>  21 files changed, 1435 insertions(+), 693 deletions(-)
> >>>  create mode 100644 include/linux/hmm-dma.h
> >>
> >> Kind reminder.

<...>

> Removing the need for scatterlists was advertised as the main goal of
> this new API, but it looks that similar effects can be achieved with
> just iterating over the pages and calling page-based DMA API directly.

Such iteration is not enough, because P2P pages don't have struct pages,
so you can't use dma_map_page_attrs() reliably and efficiently. The only
way to map P2P pages today is dma_map_sg_attrs(), which relies on SG (the
very thing we want to remove).

> Maybe I missed something. I still see some advantages in this DMA API
> extension, but I would also like to see the clear benefits from
> introducing it, like perf logs or other benchmark summary.

We haven't focused on performance yet. However, Christoph mentioned in
his block RFC [1] that even a simple conversion should improve
performance, as we perform one P2P lookup per bio and not per SG entry
as before [2]. In addition, it decreases memory usage [3] too.

[1] https://lore.kernel.org/all/cover.1730037261.git.leon@xxxxxxxxxx/
[2] https://lore.kernel.org/all/34d44537a65aba6ede215a8ad882aeee028b423a.1730037261.git.leon@xxxxxxxxxx/
[3] https://lore.kernel.org/all/383557d0fa1aa393dbab4e1daec94b6cced384ab.1730037261.git.leon@xxxxxxxxxx/

So the clear benefits are:
1. Ability to use the subsystem's native structure, e.g. bio for block,
   umem for RDMA, dmabuf for DRM, etc. This removes the current wasteful
   conversions to and from SG that exist only to work with the DMA API.
2. Batched requests and IOTLB sync optimizations (the sync is performed
   only once).
3. Avoids the very expensive pgmap pointer lookup.
4. Exposes MMIO over VFIO without hacks (a PCI BAR doesn't have struct
   pages). See this series for such a hack:
   https://lore.kernel.org/all/20250307052248.405803-1-vivek.kasireddy@xxxxxxxxx/

Thanks

>
>
> Best regards
> --
> Marek Szyprowski, PhD
> Samsung R&D Institute Poland
>
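
To make benefits 2 and 3 from the list above a bit more concrete, here is a
minimal userspace sketch that only counts operations. It is not kernel code
and not taken from the series: struct map_stats, lookup_provider(),
map_per_entry() and map_batched() are hypothetical names invented for this
illustration, and the cost model (one provider lookup and one IOTLB sync per
independently mapped entry) is a deliberate simplification.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counters; these are not kernel structures. */
struct map_stats {
	size_t p2p_lookups;	/* expensive provider (pgmap-style) lookups */
	size_t iotlb_syncs;	/* IOTLB sync operations */
	size_t linked;		/* ranges actually mapped */
};

/* Stand-in for the expensive per-page provider lookup. */
static void lookup_provider(struct map_stats *st)
{
	st->p2p_lookups++;
}

/*
 * Simplified per-entry model: every entry is mapped on its own,
 * so each one pays for a lookup and a sync.
 */
static struct map_stats map_per_entry(size_t nr_entries)
{
	struct map_stats st = { 0, 0, 0 };
	size_t i;

	for (i = 0; i < nr_entries; i++) {
		lookup_provider(&st);
		st.linked++;
		st.iotlb_syncs++;
	}
	return st;
}

/*
 * Simplified batched model: one lookup for the whole request,
 * link every range, then sync once at the end.
 */
static struct map_stats map_batched(size_t nr_entries)
{
	struct map_stats st = { 0, 0, 0 };
	size_t i;

	if (nr_entries == 0)
		return st;

	lookup_provider(&st);
	for (i = 0; i < nr_entries; i++)
		st.linked++;
	st.iotlb_syncs++;

	assert(st.linked == nr_entries);
	return st;
}
```

In this model, a request with N entries costs N lookups and N syncs in the
per-entry case, and one of each in the batched case, while the number of
linked ranges is the same. It says nothing about real timings; those would
need the benchmarks Marek asked for.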