Hi, Yan, On Fri, May 24, 2024 at 08:39:37AM +0800, Yan Zhao wrote: > On Thu, May 23, 2024 at 01:56:27PM -0600, Alex Williamson wrote: > > With the vfio device fd tied to the address space of the pseudo fs > > inode, we can use the mm to track all vmas that might be mmap'ing > > device BARs, which removes our vma_list and all the complicated lock > > ordering necessary to manually zap each related vma. > > > > Note that we can no longer store the pfn in vm_pgoff if we want to use > > unmap_mapping_range() to zap a selective portion of the device fd > > corresponding to BAR mappings. > > > > This also converts our mmap fault handler to use vmf_insert_pfn() > Looks vmf_insert_pfn() does not call memtype_reserve() to reserve memory type > for the PFN on x86 as what's done in io_remap_pfn_range(). > > Instead, it just calls lookup_memtype() and determine the final prot based on > the result from this lookup, which might not prevent others from reserving the > PFN to other memory types. I didn't worry too much on others reserving the same pfn range, as that should be the mmio region for this device, and this device should be owned by vfio driver. However I share the same question, see: https://lore.kernel.org/r/20240523223745.395337-2-peterx@xxxxxxxxxx So far I think it's not a major issue as VFIO always use UC- mem type, and that's also the default. But I do also feel like there's something we can do better, and I'll keep you copied too if I'll resend the series. Thanks, > > Does that matter? > > because we no longer have a vma_list to avoid the concurrency problem > > with io_remap_pfn_range(). The goal is to eventually use the vm_ops > > huge_fault handler to avoid the additional faulting overhead, but > > vmf_insert_pfn_{pmd,pud}() need to learn about pfnmaps first. > > > > Also, Jason notes that a race exists between unmap_mapping_range() and > > the fops mmap callback if we were to call io_remap_pfn_range() to > > populate the vma on mmap. Specifically, mmap_region() does call_mmap() > > before it does vma_link_file() which gives a window where the vma is > > populated but invisible to unmap_mapping_range(). > > > -- Peter Xu