On Tue, 18 Feb 2025 17:47:46 -0500 Peter Xu <peterx@xxxxxxxxxx> wrote: > On Tue, Feb 18, 2025 at 03:22:06PM -0700, Alex Williamson wrote: > > vfio-pci supports huge_fault for PCI MMIO BARs and will insert pud and > > pmd mappings for well aligned mappings. follow_pfnmap_start() walks the > > page table and therefore knows the page mask of the level where the > > address is found and returns this through follow_pfnmap_args.pgmask. > > Subsequent pfns from this address until the end of the mapping page are > > necessarily consecutive. Use this information to retrieve a range of > > pfnmap pfns in a single pass. > > > > With optimal mappings and alignment on systems with 1GB pud and 4KB > > page size, this reduces iterations for DMA mapping PCI BARs by a > > factor of 256K. In real world testing, the overhead of iterating > > pfns for a VM DMA mapping a 32GB PCI BAR is reduced from ~1s to > > sub-millisecond overhead. > > > > Signed-off-by: Alex Williamson <alex.williamson@xxxxxxxxxx> > > --- > > drivers/vfio/vfio_iommu_type1.c | 23 ++++++++++++++++------- > > 1 file changed, 16 insertions(+), 7 deletions(-) > > > > diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c > > index ce661f03f139..0ac56072af9f 100644 > > --- a/drivers/vfio/vfio_iommu_type1.c > > +++ b/drivers/vfio/vfio_iommu_type1.c > > @@ -520,7 +520,7 @@ static void vfio_batch_fini(struct vfio_batch *batch) > > > > static int follow_fault_pfn(struct vm_area_struct *vma, struct mm_struct *mm, > > unsigned long vaddr, unsigned long *pfn, > > - bool write_fault) > > + unsigned long *addr_mask, bool write_fault) > > { > > struct follow_pfnmap_args args = { .vma = vma, .address = vaddr }; > > int ret; > > @@ -544,10 +544,12 @@ static int follow_fault_pfn(struct vm_area_struct *vma, struct mm_struct *mm, > > return ret; > > } > > > > - if (write_fault && !args.writable) > > + if (write_fault && !args.writable) { > > ret = -EFAULT; > > - else > > + } else { > > *pfn = args.pfn; > > + *addr_mask = args.addr_mask; > > + } > > > > follow_pfnmap_end(&args); > > return ret; > > @@ -590,15 +592,22 @@ static long vaddr_get_pfns(struct mm_struct *mm, unsigned long vaddr, > > vma = vma_lookup(mm, vaddr); > > > > if (vma && vma->vm_flags & VM_PFNMAP) { > > - ret = follow_fault_pfn(vma, mm, vaddr, pfn, prot & IOMMU_WRITE); > > + unsigned long addr_mask; > > + > > + ret = follow_fault_pfn(vma, mm, vaddr, pfn, &addr_mask, > > + prot & IOMMU_WRITE); > > if (ret == -EAGAIN) > > goto retry; > > > > if (!ret) { > > - if (is_invalid_reserved_pfn(*pfn)) > > - ret = 1; > > - else > > + if (is_invalid_reserved_pfn(*pfn)) { > > + unsigned long epfn; > > + > > + epfn = (*pfn | (~addr_mask >> PAGE_SHIFT)) + 1; > > + ret = min_t(long, npages, epfn - *pfn); > > s/long/unsigned long/? ret is signed long since it's the function return and needs to be able to return -errno, so long was the intention here. Thanks, Alex > Reviewed-by: Peter Xu <peterx@xxxxxxxxxx> > > > + } else { > > ret = -EFAULT; > > + } > > } > > } > > done: > > -- > > 2.48.1 > > >