On Wed, Mar 06, 2024 at 11:43:28AM -0400, Jason Gunthorpe wrote:
> I don't think they are so fundamentally different, at least in our
> past conversations I never came out with the idea we should burden the
> driver with two different flows based on what kind of alignment the
> transfer happens to have.

Then we talked past each other..

> At least the RDMA drivers could productively use just a page aligned
> interface. But I didn't think this would make BIO users happy so never
> even thought about it..

Page aligned is generally the right thing for the block layer.  NVMe
for example already requires that anyway due to PRPs.

> > The total transfer size should just be passed in by the callers and
> > be known, and there should be no offset.
>
> The API needs the caller to figure out the total number of IOVA pages
> it needs, rounding up the CPU ranges to full aligned pages. That
> becomes the IOVA allocation.

Yes, it's a basic align up to the granularity, assuming we don't bother
with non-aligned transfers.

> > So if we want to efficiently be able to handle these cases we need
> > two APIs in the driver and a good framework to switch between them.
>
> But, what does the non-page-aligned version look like? Doesn't it
> still look basically like this?

I'd rather have the non-aligned case, for those who really need it, be
the loop over a single-region map that is needed for the direct mapping
anyway.

> And what is the actual difference if the input is aligned? The caller
> can assume it doesn't need to provide a per-range dma_addr_t during
> unmap.

A per-range dma_addr_t doesn't really make sense for the aligned and
coalesced case.

> It still can't assume the HW programming will be linear due to the P2P
> !ACS support.
>
> And it still has to call an API per-cpu range to actually program the
> IOMMU.
>
> So are they really so different to want different APIs? That strikes
> me as a big driver cost.

To not have to store a dma_addr_t per CPU range that doesn't actually
get used at all.
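
To make the align-up point above concrete, here is a rough sketch of
how a caller could size the single IOVA allocation, assuming
page-aligned CPU ranges and a PAGE_SIZE granularity.  struct my_range
and my_iova_alloc_size are made-up names for illustration, not an
existing API:

#include <linux/align.h>
#include <linux/mm.h>

/* Illustrative only: one physically contiguous CPU range. */
struct my_range {
	struct page	*page;
	unsigned int	offset;	/* assumed 0 for the aligned case */
	unsigned int	len;
};

/*
 * Size the single IOVA allocation for the aligned/coalesced case by
 * rounding every CPU range up to the IOVA granularity (here simply
 * PAGE_SIZE).  A non-zero starting offset would additionally have to
 * be covered by the allocation.
 */
static size_t my_iova_alloc_size(const struct my_range *ranges,
				 unsigned int nr)
{
	size_t total = 0;
	unsigned int i;

	for (i = 0; i < nr; i++)
		total += ALIGN(ranges[i].len, PAGE_SIZE);

	return total;
}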
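
And a sketch of the non-aligned fallback mentioned above: just the
per-range loop over the existing dma_map_page()/dma_unmap_page() calls
that the direct mapping needs anyway.  Note how this variant has to
keep a dma_addr_t per CPU range for unmap, which is exactly the storage
the aligned/coalesced path avoids.  my_map_ranges is again a made-up
name:

#include <linux/dma-mapping.h>

static int my_map_ranges(struct device *dev,
			 const struct my_range *ranges,
			 dma_addr_t *dma_addrs, unsigned int nr,
			 enum dma_data_direction dir)
{
	unsigned int i;

	for (i = 0; i < nr; i++) {
		/* Map each CPU range on its own and remember its address. */
		dma_addrs[i] = dma_map_page(dev, ranges[i].page,
					    ranges[i].offset,
					    ranges[i].len, dir);
		if (dma_mapping_error(dev, dma_addrs[i]))
			goto unwind;
	}
	return 0;

unwind:
	/* Undo the ranges mapped so far. */
	while (i--)
		dma_unmap_page(dev, dma_addrs[i], ranges[i].len, dir);
	return -ENOMEM;
}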