On Thu, May 23, 2024 at 01:56:27PM -0600, Alex Williamson wrote:
> With the vfio device fd tied to the address space of the pseudo fs
> inode, we can use the mm to track all vmas that might be mmap'ing
> device BARs, which removes our vma_list and all the complicated lock
> ordering necessary to manually zap each related vma.
>
> Note that we can no longer store the pfn in vm_pgoff if we want to use
> unmap_mapping_range() to zap a selective portion of the device fd
> corresponding to BAR mappings.
>
> This also converts our mmap fault handler to use vmf_insert_pfn()

It looks like vmf_insert_pfn() does not call memtype_reserve() to
reserve a memory type for the PFN on x86, as io_remap_pfn_range() does.
Instead, it just calls lookup_memtype() and determines the final prot
from the result of that lookup, which may not prevent others from
reserving the PFN with a different memory type. Does that matter?

> because we no longer have a vma_list to avoid the concurrency problem
> with io_remap_pfn_range(). The goal is to eventually use the vm_ops
> huge_fault handler to avoid the additional faulting overhead, but
> vmf_insert_pfn_{pmd,pud}() need to learn about pfnmaps first.
>
> Also, Jason notes that a race exists between unmap_mapping_range() and
> the fops mmap callback if we were to call io_remap_pfn_range() to
> populate the vma on mmap. Specifically, mmap_region() does call_mmap()
> before it does vma_link_file() which gives a window where the vma is
> populated but invisible to unmap_mapping_range().
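
For context, the fault-time population the patch describes looks roughly
like the sketch below. This is only an illustration of the shape of a
vmf_insert_pfn()-based handler, not the actual patch: the helper
vfio_pci_bar_pfn() is hypothetical, and the real handler also has to
deal with the memory-enable state of the device.

/*
 * Sketch only.  Since vm_pgoff no longer carries the pfn (it now
 * encodes the BAR offset so unmap_mapping_range() can zap a selective
 * range of the device fd), the pfn must be derived at fault time from
 * the BAR base plus the fault address.
 */
static vm_fault_t vfio_pci_mmap_fault(struct vm_fault *vmf)
{
	struct vm_area_struct *vma = vmf->vma;
	struct vfio_pci_core_device *vdev = vma->vm_private_data;
	unsigned long pfn;

	/* hypothetical helper: translate vma + fault address to a BAR pfn */
	pfn = vfio_pci_bar_pfn(vdev, vma, vmf->address);

	/*
	 * By the time we fault, the vma is linked into the file's
	 * address_space, so unmap_mapping_range() can always find and
	 * zap this mapping; populating in the fops mmap callback via
	 * io_remap_pfn_range() would not have that guarantee.
	 */
	return vmf_insert_pfn(vma, vmf->address, pfn);
}

static const struct vm_operations_struct vfio_pci_mmap_ops = {
	.fault = vfio_pci_mmap_fault,
};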