On Wed, Dec 18, 2024 at 01:53:17PM -0800, Suren Baghdasaryan wrote:

> Ah, ok I see now. I completely misunderstood what for_each_vma_range()
> was doing.
>
> Then I think vma_start_write() should remain inside
> vms_gather_munmap_vmas() and all vmas in mas_detach should be

No, it must not. You really are not modifying anything yet (except the
splits, which, as we've already noted, write-lock themselves).

> write-locked, even the ones we are not modifying. Otherwise what would
> prevent the race I mentioned before?
>
> __mmap_region
>   __mmap_prepare
>     vms_gather_munmap_vmas // adds vmas to be unmapped into mas_detach,
>                            // some locked by __split_vma(), some not locked
>
>                            lock_vma_under_rcu()
>                              vma = mas_walk // finds unlocked vma also in mas_detach
>                              vma_start_read(vma) // succeeds since vma is not locked
>                              // vma->detached, vm_start, vm_end checks pass
>                              // vma is successfully read-locked
>
>     vms_clean_up_area(mas_detach)
>       vms_clear_ptes
>                              // steps on a cleared PTE

So here we have the added complexity that the vma is not unhooked at
all. Is there anything that would prevent a concurrent gup_fast() from
doing the same -- touching a cleared PTE?

AFAICT two threads, one doing an overlapping mmap() and the other doing
gup_fast(), can result in exactly this scenario.

If we don't care about the GUP case, then I'm thinking we should not
care about the lockless RCU case either.

>   __mmap_new_vma
>     vma_set_range // installs new vma in the range
>   __mmap_complete
>     vms_complete_munmap_vmas // vmas are write-locked and detached
>                              // but it's too late

But at this point that old vma really is unhooked, and the
vma_start_write() here will ensure readers are gone, and it will clear
the PTEs *again*.