Hey Drew, On Wed, 22 May 2024 18:56:29 +0200 Andrew Jones <ajones@xxxxxxxxxxxxxxxx> wrote: > On Mon, May 08, 2023 at 08:58:42PM GMT, Yan Zhao wrote: > > In VFIO type1, vaddr_get_pfns() will try fault in MMIO PFNs after > > pin_user_pages_remote() returns -EFAULT. > > > > follow_fault_pfn > > fixup_user_fault > > handle_mm_fault > > handle_mm_fault > > do_fault > > do_shared_fault > > do_fault > > __do_fault > > vfio_pci_mmap_fault > > io_remap_pfn_range > > remap_pfn_range > > track_pfn_remap > > vm_flags_set ==> mmap_assert_write_locked(vma->vm_mm) > > remap_pfn_range_notrack > > vm_flags_set ==> mmap_assert_write_locked(vma->vm_mm) > > > > As io_remap_pfn_range() will call vm_flags_set() to update vm_flags [1], > > holding of mmap write lock is required. > > So, update vfio_pci_mmap_fault() to drop mmap read lock and take mmap > > write lock. > > > > [1] https://lkml.kernel.org/r/20230126193752.297968-3-surenb@xxxxxxxxxx > > commit bc292ab00f6c ("mm: introduce vma->vm_flags wrapper functions") > > commit 1c71222e5f23 > > ("mm: replace vma->vm_flags direct modifications with modifier calls") > > > > With linux-next I started noticing traces similar to the above without > lockdep, since it has ba168b52bf8e ("mm: use rwsem assertion macros for > mmap_lock"). Were there any follow ups to this? Sorry if my quick > searching missed it. We've been working on it in the background, but we're not there yet. I have a branch[1] that converts to an address space on the vfio device, making unmap_mapping_range() available to easily zap mappings to the BARs without all the ugly vma tracking, but that still leaves us with the problem that the remap_pfn_range() shouldn't be called from the fault handler resulting in this lockdep warning. We can switch to vmf_insert_pfn() in the fault handler, but that's a noticeable performance penalty. The angle I've been working recently is to try taking advantage of huge_fault support because we do have vmf_insert_pfn_{pmd,pud}, but these don't currently support pfnmap pfns. I've been working with Peter Xu as this aligns with some other work of his. It's working and resolves the performance issue, especially with a little alignment tweaking in userspace to take advantage of pud mappings. I'm not sure if there are any outstanding blockers on Peter's side, but this seems like a good route from the vfio side. If we're seeing this now without lockdep, we might need to bite the bullet and take the hit with vmf_insert_pfn() while the pmd/pud path learn about pfnmaps. Thanks, Alex [1]https://github.com/awilliam/linux-vfio/commits/vfio-address-space/