On Tue 23-11-21 09:56:41, Suren Baghdasaryan wrote:
> On Tue, Nov 23, 2021 at 5:19 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> >
> > On Tue, Nov 16, 2021 at 01:57:14PM -0800, Suren Baghdasaryan wrote:
> > > @@ -3170,6 +3172,7 @@ void exit_mmap(struct mm_struct *mm)
> > >  	unmap_vmas(&tlb, vma, 0, -1);
> > >  	free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, USER_PGTABLES_CEILING);
> > >  	tlb_finish_mmu(&tlb);
> > > +	mmap_write_unlock(mm);
> > >
> > >  	/*
> > >  	 * Walk the list again, actually closing and freeing it,
> >
> > Is there a reason to unlock here instead of after the remove_vma loop?
> > We'll need the mmap sem held during that loop when VMAs are stored in
> > the maple tree.
>
> I didn't realize remove_vma() would need to be protected as well. I
> think I can move mmap_write_unlock down to cover the last walk too
> with no impact.
> Does anyone know if there was any specific reason to perform that last
> walk with no locks held (as the comment states)? I can track that
> comment back to the Linux-2.6.12-rc2 merge with no earlier history, so I'm not
> sure if it's critical not to hold any locks at this point. Seems to me
> it's ok to hold mmap_write_unlock but maybe I'm missing something?

I suspect the primary reason was that neither fput (and callbacks
invoked from it) nor vm_close would need to be very careful about
interacting with mm locks. fput is async these days so it shouldn't be
problematic. vm_ops->close doesn't have any real contract definition
AFAIK, but taking mmap_sem from those would be really surprising. They
should mostly be tearing down internal vma state and that shouldn't
really require address space protection.
-- 
Michal Hocko
SUSE Labs
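
For illustration only (this is a rough sketch, not the actual patch or the
full upstream function): the tail of exit_mmap() with the unlock moved past
the final remove_vma() walk, as suggested in the discussion above. The
surrounding code is abbreviated from the quoted hunk; where the write lock
is first taken is assumed to happen earlier in the series.

	void exit_mmap(struct mm_struct *mm)
	{
		struct mmu_gather tlb;
		struct vm_area_struct *vma;
		unsigned long nr_accounted = 0;

		/* ... earlier teardown elided ... */

		/* assumed taken earlier in the patch series */
		mmap_write_lock(mm);

		unmap_vmas(&tlb, vma, 0, -1);
		free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, USER_PGTABLES_CEILING);
		tlb_finish_mmu(&tlb);

		/*
		 * Walk the list again, actually closing and freeing it.
		 * Keeping the write lock held here means vm_ops->close and
		 * fput run under the lock; per the discussion above this
		 * should be fine since they only tear down per-VMA state.
		 */
		while (vma) {
			if (vma->vm_flags & VM_ACCOUNT)
				nr_accounted += vma_pages(vma);
			vma = remove_vma(vma);
		}

		/* unlock moved down to cover the remove_vma() walk */
		mmap_write_unlock(mm);

		vm_unacct_memory(nr_accounted);
	}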