On Tue, 2024-06-11 at 17:10 +0200, David Hildenbrand wrote: > > > > > > which checks mmap_assert_write_locked(). > > > > > > Setting VMA flags would be racy with the mmap lock in read mode. > > > > > > > > > remap_pfn_range() documents: "this is only safe if the mm semaphore is > > > held when called." which doesn't spell out if it needs to be held in > > > write mode (which I think it does) :) > > > > Logically this makes sense to me. At the same time it looks like > > fixup_user_fault() expects the caller to only hold mmap_read_lock() as > > I do here. In there it even retakes mmap_read_lock(). But then wouldn't > > any fault handling by its nature need to hold the write lock? > > Well, if you're calling remap_pfn_range() right now the expectation is > that we hold it in write mode. :) > > Staring at some random users, they all call it from mmap(), where you > hold the mmap lock in write mode. > > > I wonder why we are not seeing that splat with vfio all of the time? > > That mmap lock check was added "recently". In 1c71222e5f23 we started > using vm_flags_set(). That (including the mmap_assert_write_locked()) > check was added via bc292ab00f6c almost 1.5 years ago. > > Maybe vfio is a bit special and was never really run with lockdep? > > > > > > > > > > > > My best guess is: if you are using remap_pfn_range() from a fault > > > handler (not during mmap time) you are doing something wrong, that's why > > > you get that report. > > > > @Alex: I guess so far the vfio_pci_mmap_fault() handler is only ever > > triggered by "normal"/"actual" page faults where this isn't a problem? > > Or could it be a problem there too? > > > > I think we should see it there as well, unless I am missing something. > > > > > > > vmf_insert_pfn() and friends might be better alternatives, that make > > > sure that the VMA already received the proper VMA flags at mmap time. > > > > > > There would be ways of silencing that check: for example, making sure at > mmap time that these flags are already set, and skipping modifications > if the flags are already set. > > But, we'll run into more similar checks in x86 VM_PAT code, where we > would do vm_flags_set(vma, VM_PAT) from track_pfn_remap. Some of that > code really doesn't want to be called concurrently (e.g., "vma->vm_pgoff > = pfn;"). > > I thought that we silenced some of these warnings in the past using > __vm_flags_mod(). But it sounds hacky. > > CCing Sureen. > Before I forget it, thanks a lot for your incredible help David!