On 2022-01-11 2:25 p.m., Matthew Wilcox wrote: > That's reproducing the bad decision of the scatterlist, only with > a different encoding. You end up with something like: > > struct neoscat { > dma_addr_t dma_addr; > phys_addr_t phys_addr; > size_t dma_len; > size_t phys_len; > }; > > and the dma_addr and dma_len are unused by all-but-the-first entry when > you have a competent IOMMU. We want a different data structure in and > out, and we may as well keep using the scatterlist for the dma-map-out. With my P2PDMA patchset, even with a competent IOMMU, we need to support multiple dma_addr/dma_len pairs (plus the flag bit). This is required to program IOVAs and multiple bus addresses into a single DMA transactions. I think using the scatter list for the DMA-out side is not ideal seeing we don't need the page pointers or multiple length fields and we won't be able to change the sgl substantially given the massive amount of existing use cases that won't go away over night. My hope would be along these lines: struct phy_range { phys_addr_t phyr_addr; u32 phyr_len; u32 phyr_flags; }; struct dma_range { dma_addr_t dmar_addr; u32 dmar_len; u32 dmar_flags; }; A new GUP helper would somehow return a list of phy_range structs and the new dma_map function would take that list and return a list of dma_range structs. Each element in the list could represent a segment up to 4GB, so any range longer than that would need multiple items in the list. (Alternatively, perhaps the length could be a 64bit value and we steal some of the top bits for flags or some such). The flags would not only be needed by some of the use cases mentioned (FOLL_PIN or DMA_BUS_ADDRESS) but could also support chaining these lists like SGLs so continuous vmallocs would not be necessary for longer lists. If there's an [phy|dma]_range_list struct (or some such) which contains these range structs (per some details of Jason's suggestions) that'd be fine by me too and would just depend on implementation details. However, the main problem I see is a chicken and egg problem. The new dma_map function would need to be implemented by every dma_map provider or any driver trying to use it would need a messy fallback. Either that, or we need a wrapper that allocates an appropriately sized SGL to pass to any dma_map implementation that doesn't support the new structures. Logan