On Mon, Dec 07, 2020 at 09:48:48PM -0500, Daniel Jordan wrote:
> Jason Gunthorpe <jgg@xxxxxxxx> writes:
> > On Fri, Dec 04, 2020 at 03:05:46PM -0500, Daniel Jordan wrote:
> >> Well Alex can correct me, but I went digging and a comment from the
> >> first type1 vfio commit says the iommu API didn't promise to unmap
> >> subpages of previous mappings, so doing page at a time gave flexibility
> >> at the cost of inefficiency.
> >
> > iommu restrictions are not related to with gup. vfio needs to get the
> > page list from the page tables as efficiently as possible, then you
> > break it up into what you want to feed into the IOMMU how the iommu
> > wants.
> >
> > vfio must maintain a page list to call unpin_user_pages() anyhow, so
>
> It does in some cases but not others, namely the expensive
> VFIO_IOMMU_MAP_DMA/UNMAP_DMA path where the iommu page tables are used
> to find the pfns when unpinning.

Oh, I see..

Well, that is still possible, but vfio really needs to batch
operations, eg call pin_user_pages() with some larger buffer and store
those into the iommu and then reverse this to build up contiguous runs
of pages to unpin

> I don't see why vfio couldn't do as you say, though, and the worst case
> memory overhead of using scatterlist to remember the pfns of a 300g VM
> backed by huge but physically discontiguous pages is only a few meg, not
> bad at all.

Yes, but 0 is still better.. I would start by focusing on batching
pin_user_pages.

Jason
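
For illustration only, the "build up contiguous runs of pages to unpin"
step could look roughly like the user-space sketch below. It is not vfio
or kernel code: struct pfn_run and coalesce_pfns() are hypothetical names
made up for this example, and it assumes the pfns for one batch have
already been collected into a plain array (for instance by walking the
iommu page tables).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: one physically contiguous run of page frame numbers. */
struct pfn_run {
	uint64_t start_pfn;	/* first pfn in the run */
	size_t npages;		/* number of consecutive pfns in the run */
};

/*
 * Hypothetical helper: coalesce pfns[0..n) into contiguous runs.
 *
 * Writes at most max_runs entries into runs[] and returns how many were
 * written. *consumed is set to the number of pfns covered, so a caller
 * working in fixed-size batches can resume from pfns + *consumed.
 */
size_t coalesce_pfns(const uint64_t *pfns, size_t n,
		     struct pfn_run *runs, size_t max_runs,
		     size_t *consumed)
{
	size_t nruns = 0;
	size_t i = 0;

	assert(pfns != NULL && runs != NULL && consumed != NULL);

	while (i < n && nruns < max_runs) {
		size_t len = 1;

		/* Extend the run while the next pfn is exactly adjacent. */
		while (i + len < n && pfns[i + len] == pfns[i] + len)
			len++;

		runs[nruns].start_pfn = pfns[i];
		runs[nruns].npages = len;
		nruns++;
		i += len;
	}

	*consumed = i;
	return nruns;
}
```

A caller would then issue one unpin operation per run instead of one per
page. In this sketch a run that happens to straddle two batches is
reported as two runs; merging across batches is left out to keep the
example short.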