On Mon, Jan 23, 2023 at 1:56 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> On Fri 20-01-23 09:50:01, Suren Baghdasaryan wrote:
> > On Fri, Jan 20, 2023 at 9:32 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> [...]
> > > The page fault handler (or whatever other reader -- ptrace, proc, etc)
> > > should have a refcount on the mm_struct, so we can't be in this path
> > > trying to free VMAs. Right?
> >
> > Hmm. That sounds right. I checked process_mrelease() as well, which
> > operated on mm with only mmgrab()+mmap_read_lock() but it only unmaps
> > VMAs without freeing them, so we are still good. Michal, do you agree
> > this is ok?
>
> Don't we need RCU protections for the vma life time assurance? Jann has
> already shown how rwsem is not safe wrt to unlock and free without RCU.

Jann's case requires a thread freeing the VMA to be blocked on vma
write lock waiting for the vma read lock to be released by a page
fault handler. However exit_mmap() means mm->mm_users==0, which in
turn suggests that there are no racing page fault handlers and no new
page fault handlers will appear. Is that a correct assumption? If so,
then races with page fault handlers can't happen while in exit_mmap().

Any other path (other than page fault handlers) accesses vma->lock
under protection of mmap_lock (for read or write, does not matter).
One exception is when we operate on an isolated VMA, then we don't
need mmap_lock protection, but exit_mmap() does not deal with isolated
VMAs, so that is out of scope here. exit_mmap() frees vm_area_structs
under protection of mmap_lock in write mode, so races with anything
other than a page fault handler should be safe as they are today.

That said, the future possible users of lock_vma_under_rcu() using VMA
without mmap_lock protection will have to ensure mm's stability while
they are using the obtained VMA. IOW they should elevate mm's refcount
and keep it elevated as long as they are using that VMA, dropping it
no earlier than when vma->lock is released.
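
To make the ordering concrete, here is a rough userspace model of what
such a caller would have to do. This is only a sketch, not kernel code:
every model_* name is hypothetical, the mm holds a single VMA, and the
real lock_vma_under_rcu() can also fail and return NULL, which the
model leaves out. The only point is the order: pin mm, lock VMA, use
it, unlock VMA, and only then drop the mm reference.

```c
/*
 * Userspace model only. Struct and helper names are made up for
 * illustration (assumption); none of the model_* functions exist
 * in the kernel.
 */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct model_vma {
	atomic_int read_locked;	/* stands in for vma->lock held for read */
};

struct model_mm {
	atomic_int mm_users;	/* 0 means exit_mmap() may free the VMAs */
	struct model_vma vma;	/* a single VMA keeps the model small */
};

/* Modelled on mmget_not_zero(): take a reference unless mm is dying. */
static bool model_mmget_not_zero(struct model_mm *mm)
{
	int old = atomic_load(&mm->mm_users);

	while (old != 0) {
		/* On failure 'old' is reloaded and re-checked against zero. */
		if (atomic_compare_exchange_weak(&mm->mm_users, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if it was the last one. */
static bool model_mmput(struct model_mm *mm)
{
	int old = atomic_fetch_sub(&mm->mm_users, 1);

	assert(old > 0);
	return old == 1;
}

/* Stand-in for lock_vma_under_rcu(): hands back the VMA read-locked. */
static struct model_vma *model_lock_vma(struct model_mm *mm)
{
	atomic_fetch_add(&mm->vma.read_locked, 1);
	return &mm->vma;
}

static void model_unlock_vma(struct model_vma *vma)
{
	atomic_fetch_sub(&vma->read_locked, 1);
}

/* Returns false if mm was already on its way out. */
static bool model_use_vma(struct model_mm *mm)
{
	struct model_vma *vma;

	if (!model_mmget_not_zero(mm))
		return false;	/* mm_users == 0: exit_mmap() may be running */

	vma = model_lock_vma(mm);
	/* ... use vma here; mm_users > 0 keeps exit_mmap() away ... */
	model_unlock_vma(vma);

	model_mmput(mm);	/* only after vma->lock has been released */
	return true;
}
```

With mm_users already at zero, model_use_vma() backs off without
touching the VMA, which is the case exit_mmap() relies on.
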
I guess it would be a good idea to document that requirement in
lock_vma_under_rcu() comments if we decide to take this route.

>
> --
> Michal Hocko
> SUSE Labs