On Fri, May 01, 2020 at 03:39:19PM -0600, Alex Williamson wrote:
> Rather than calling remap_pfn_range() when a region is mmap'd, setup
> a vm_ops handler to support dynamic faulting of the range on access.
> This allows us to manage a list of vmas actively mapping the area that
> we can later use to invalidate those mappings.  The open callback
> invalidates the vma range so that all tracking is inserted in the
> fault handler and removed in the close handler.
>
> Signed-off-by: Alex Williamson <alex.williamson@xxxxxxxxxx>
> ---
>  drivers/vfio/pci/vfio_pci.c         |   76 ++++++++++++++++++++++++++++++++++-
>  drivers/vfio/pci/vfio_pci_private.h |    7 +++
>  2 files changed, 81 insertions(+), 2 deletions(-)

> +static vm_fault_t vfio_pci_mmap_fault(struct vm_fault *vmf)
> +{
> +	struct vm_area_struct *vma = vmf->vma;
> +	struct vfio_pci_device *vdev = vma->vm_private_data;
> +
> +	if (vfio_pci_add_vma(vdev, vma))
> +		return VM_FAULT_OOM;
> +
> +	if (remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
> +			    vma->vm_end - vma->vm_start, vma->vm_page_prot))
> +		return VM_FAULT_SIGBUS;
> +
> +	return VM_FAULT_NOPAGE;
> +}
> +
> +static const struct vm_operations_struct vfio_pci_mmap_ops = {
> +	.open = vfio_pci_mmap_open,
> +	.close = vfio_pci_mmap_close,
> +	.fault = vfio_pci_mmap_fault,
> +};
> +
>  static int vfio_pci_mmap(void *device_data, struct vm_area_struct *vma)
>  {
>  	struct vfio_pci_device *vdev = device_data;
> @@ -1357,8 +1421,14 @@ static int vfio_pci_mmap(void *device_data, struct vm_area_struct *vma)
>  	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
>  	vma->vm_pgoff = (pci_resource_start(pdev, index) >> PAGE_SHIFT) + pgoff;
>
> -	return remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
> -			       req_len, vma->vm_page_prot);
> +	/*
> +	 * See remap_pfn_range(), called from vfio_pci_fault() but we can't
> +	 * change vm_flags within the fault handler.  Set them now.
> +	 */
> +	vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;
> +	vma->vm_ops = &vfio_pci_mmap_ops;

Perhaps do the vfio_pci_add_vma & remap_pfn_range combo here if the
BAR is activated? That way a fully populated BAR is presented in the
common case and avoids taking a fault path?

But it does seem OK as is

Jason
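
To make the trade-off in the suggestion concrete, here is a rough userspace
model of the two orderings. It is only a sketch, not kernel code: every name
in it (model_bar, model_vma, model_populate, model_mmap, model_access) is
hypothetical, "mem_enabled" is an assumed stand-in for "the BAR is activated",
and model_populate() merely represents the vfio_pci_add_vma() plus
remap_pfn_range() pair from the quoted fault handler.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; none of these are kernel APIs. */
struct model_bar {
	bool mem_enabled;	/* assumed meaning of "the BAR is activated" */
};

struct model_vma {
	struct model_bar *bar;
	bool tracked;		/* models vfio_pci_add_vma() having run */
	bool populated;		/* models remap_pfn_range() having run */
	unsigned int faults;	/* how many times the fault path was taken */
};

/* Models the add_vma + remap_pfn_range combo; always succeeds here. */
static int model_populate(struct model_vma *vma)
{
	vma->tracked = true;
	vma->populated = true;
	return 0;
}

/*
 * Models the suggested mmap path: populate eagerly when the BAR is
 * active, otherwise leave the range empty for the fault handler.
 */
static int model_mmap(struct model_bar *bar, struct model_vma *vma)
{
	vma->bar = bar;
	vma->tracked = false;
	vma->populated = false;
	vma->faults = 0;

	if (bar->mem_enabled)
		return model_populate(vma);

	return 0;
}

/* Models a first touch of the mapping: fault only if nothing is mapped. */
static int model_access(struct model_vma *vma)
{
	if (vma->populated)
		return 0;

	vma->faults++;
	return model_populate(vma);
}
```

With an active BAR, model_mmap() leaves the range tracked and populated, so
model_access() never increments faults. With an inactive BAR the first
model_access() takes the fault path once, which matches what the patch as
posted does for every mapping.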