Any support for this or should we just go with the v2 series[1] by
itself for v6.10?  Thanks,

Alex

[1]https://lore.kernel.org/all/20240530045236.1005864-1-alex.williamson@xxxxxxxxxx/

On Thu, 6 Jun 2024 21:52:07 -0600
Alex Williamson <alex.williamson@xxxxxxxxxx> wrote:

> In order to improve performance of typical scenarios we can try to insert
> the entire vma on fault.  This accelerates typical cases, such as when
> the MMIO region is DMA mapped by QEMU.  The vfio_iommu_type1 driver will
> fault in the entire DMA mapped range through fixup_user_fault().
>
> In synthetic testing, this improves the time required to walk a PCI BAR
> mapping from userspace by roughly 1/3rd.
>
> This is likely an interim solution until vmf_insert_pfn_{pmd,pud}() gain
> support for pfnmaps.
>
> Suggested-by: Yan Zhao <yan.y.zhao@xxxxxxxxx>
> Link: https://lore.kernel.org/all/Zl6XdUkt%2FzMMGOLF@xxxxxxxxxxxxxxxxxxxxxxxxx/
> Signed-off-by: Alex Williamson <alex.williamson@xxxxxxxxxx>
> ---
>
> I'm sending this as a follow-on patch to the v2 series[1] because this
> is largely a performance optimization, and one that we may want to
> revert when we can introduce huge_fault support.  In the meantime, I
> can't argue with the 1/3rd performance improvement this provides to
> reduce the overall impact of the series below.  Without objection I'd
> therefore target this for v6.10 as well.
> Thanks,
>
> Alex
>
> [1]https://lore.kernel.org/all/20240530045236.1005864-1-alex.williamson@xxxxxxxxxx/
>
>  drivers/vfio/pci/vfio_pci_core.c | 19 +++++++++++++++++--
>  1 file changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> index db31c27bf78b..987c7921affa 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -1662,6 +1662,7 @@ static vm_fault_t vfio_pci_mmap_fault(struct vm_fault *vmf)
>  	struct vm_area_struct *vma = vmf->vma;
>  	struct vfio_pci_core_device *vdev = vma->vm_private_data;
>  	unsigned long pfn, pgoff = vmf->pgoff - vma->vm_pgoff;
> +	unsigned long addr = vma->vm_start;
>  	vm_fault_t ret = VM_FAULT_SIGBUS;
>
>  	pfn = vma_to_pfn(vma);
> @@ -1669,11 +1670,25 @@ static vm_fault_t vfio_pci_mmap_fault(struct vm_fault *vmf)
>  	down_read(&vdev->memory_lock);
>
>  	if (vdev->pm_runtime_engaged || !__vfio_pci_memory_enabled(vdev))
> -		goto out_disabled;
> +		goto out_unlock;
>
>  	ret = vmf_insert_pfn(vma, vmf->address, pfn + pgoff);
> +	if (ret & VM_FAULT_ERROR)
> +		goto out_unlock;
>
> -out_disabled:
> +	/*
> +	 * Pre-fault the remainder of the vma, abort further insertions and
> +	 * suppress error if fault is encountered during pre-fault.
> +	 */
> +	for (; addr < vma->vm_end; addr += PAGE_SIZE, pfn++) {
> +		if (addr == vmf->address)
> +			continue;
> +
> +		if (vmf_insert_pfn(vma, addr, pfn) & VM_FAULT_ERROR)
> +			break;
> +	}
> +
> +out_unlock:
>  	up_read(&vdev->memory_lock);
>
>  	return ret;
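
For anyone who wants to reason about the control flow without a kernel tree at hand, here is a rough userspace model of the fault path in the patch above. It is only a sketch: struct model_vma, model_insert_pfn(), model_fault() and the fail_at knob are hypothetical stand-ins invented for illustration, not kernel interfaces, and locking and the memory-enabled check are left out. It assumes a page-aligned fault address and a nonzero base pfn, so that 0 can mean "not present".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL
#define MODEL_MAX_PAGES 64

/* Hypothetical stand-in for a vma and its page table; not a kernel type. */
struct model_vma {
	unsigned long start;                /* first address of the mapping */
	unsigned long end;                  /* one past the last address */
	unsigned long base_pfn;             /* pfn backing 'start', assumed nonzero */
	unsigned long pte[MODEL_MAX_PAGES]; /* 0 means "not present" */
	long fail_at;                       /* page index that fails to insert, or -1 */
};

/* Hypothetical stand-in for vmf_insert_pfn(); returns false on error. */
static bool model_insert_pfn(struct model_vma *vma, unsigned long addr,
			     unsigned long pfn)
{
	size_t idx = (addr - vma->start) / MODEL_PAGE_SIZE;

	if (idx >= MODEL_MAX_PAGES || (long)idx == vma->fail_at)
		return false;

	vma->pte[idx] = pfn;
	return true;
}

/*
 * Models the patched fault handler: the faulting page decides the return
 * value, the rest of the vma is inserted on a best-effort basis.
 */
static bool model_fault(struct model_vma *vma, unsigned long fault_addr)
{
	unsigned long pgoff, addr = vma->start, pfn = vma->base_pfn;

	assert(fault_addr >= vma->start && fault_addr < vma->end);
	assert(fault_addr % MODEL_PAGE_SIZE == 0);

	pgoff = (fault_addr - vma->start) / MODEL_PAGE_SIZE;

	/* An error on the faulting page itself is reported to the caller. */
	if (!model_insert_pfn(vma, fault_addr, pfn + pgoff))
		return false;

	/* Pre-fault the remainder; stop quietly at the first failure. */
	for (; addr < vma->end; addr += MODEL_PAGE_SIZE, pfn++) {
		if (addr == fault_addr)
			continue;

		if (!model_insert_pfn(vma, addr, pfn))
			break;
	}

	return true;
}

/* Counts populated entries so the effect of one fault can be inspected. */
static size_t model_count_present(const struct model_vma *vma)
{
	size_t i, n = 0;

	for (i = 0; i < MODEL_MAX_PAGES; i++)
		n += vma->pte[i] != 0;

	return n;
}
```

In this model, a single call to model_fault() on an 8-page mapping with fail_at set to -1 leaves all 8 entries populated. Setting fail_at to a page other than the faulting one still returns success but leaves the pages from the failing index onward unmapped, so those would simply fault again later.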