On Thu, Oct 01, 2020 at 01:51:33AM +0200, Jann Horn wrote:
> On Thu, Oct 1, 2020 at 1:26 AM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> > On Wed, Sep 30, 2020 at 10:14:57PM +0200, Jann Horn wrote:
> > > On Wed, Sep 30, 2020 at 2:50 PM Jann Horn <jannh@xxxxxxxxxx> wrote:
> > > > On Wed, Sep 30, 2020 at 2:30 PM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> > > > > On Tue, Sep 29, 2020 at 06:20:00PM -0700, Jann Horn wrote:
> > > > > > In preparation for adding a mmap_assert_locked() check in
> > > > > > __get_user_pages(), teach the mmap_assert_*locked() helpers that it's fine
> > > > > > to operate on an mm without locking in the middle of execve() as long as
> > > > > > it hasn't been installed on a process yet.
> > > > >
> > > > > I'm happy to see lockdep being added here, but can you elaborate on
> > > > > why add this mmap_locked_required instead of obtaining the lock in the
> > > > > execv path?
> > > >
> > > > My thinking was: At that point, we're logically still in the
> > > > single-owner initialization phase of the mm_struct. Almost any object
> > > > has initialization and teardown steps that occur in a context where
> > > > the object only has a single owner, and therefore no locking is
> > > > required. It seems to me that adding locking in places like
> > > > get_arg_page() would be confusing because it would suggest the
> > > > existence of concurrency where there is no actual concurrency, and it
> > > > might be annoying in terms of lockdep if someone tries to use
> > > > something like get_arg_page() while holding the mmap_sem of the
> > > > calling process. It would also mean that we'd be doing extra locking
> > > > in normal kernel builds that isn't actually logically required.
> > > >
> > > > Hmm, on the other hand, dup_mmap() already locks the child mm (with
> > > > mmap_write_lock_nested()), so I guess it wouldn't be too bad to also
> > > > do it in get_arg_page() and tomoyo_dump_page(), with comments that
> > > > note that we're doing this for lockdep consistency... I guess I can go
> > > > change this in v2.
> > >
> > > Actually, I'm taking that back. There's an extra problem:
> > > get_arg_page() accesses bprm->vma, which is set all the way back in
> > > __bprm_mm_init(). We really shouldn't be pretending that we're
> > > properly taking the mmap_sem when actually, we keep reusing a
> > > vm_area_struct pointer.
> >
> > Any chance the mmap lock can just be held from mm_struct allocation
> > till exec inserts it into the process?
>
> Hm... it should work if we define a lockdep subclass for this so that
> lockdep is happy when we call get_user() on the old mm_struct while
> holding that mmap lock.

A subclass isn't right, it has to be a _nested annotation.

Nested locking is a pretty good reason to not be able to do this; this
is something lockdep does struggle to model.

Jason
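
For anyone following the thread, the relaxed assertion rule from the original patch description (an unlocked mm is acceptable in the middle of execve() as long as it has not been installed on a process yet) can be modelled in plain userspace C. This is only an illustrative sketch under stated assumptions, not the kernel code: struct toy_mm, its fields, and all toy_* helpers are made-up names, and a simple counter stands in for the real mmap lock and for lockdep.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for mm_struct; every name in this sketch is hypothetical. */
struct toy_mm {
    bool installed;   /* set once "exec" has attached the mm to a process */
    int  lock_depth;  /* > 0 while the toy "mmap lock" is held */
};

/* A freshly allocated mm has a single owner and is not locked. */
static void toy_mm_init(struct toy_mm *mm)
{
    mm->installed = false;
    mm->lock_depth = 0;
}

static void toy_mmap_lock(struct toy_mm *mm)
{
    mm->lock_depth++;
}

static void toy_mmap_unlock(struct toy_mm *mm)
{
    assert(mm->lock_depth > 0);  /* unlocking an unheld lock is a bug */
    mm->lock_depth--;
}

/* Models the point where exec makes the mm visible to other contexts. */
static void toy_mm_install(struct toy_mm *mm)
{
    mm->installed = true;
}

/*
 * Relaxed check: access is fine if the lock is held, or if the mm is
 * still in its single-owner initialization phase (not installed yet).
 */
static bool toy_mmap_access_ok(const struct toy_mm *mm)
{
    return mm->lock_depth > 0 || !mm->installed;
}
```

In this model, a get_arg_page()-style access before toy_mm_install() passes the check without taking the lock, while the same access after installation passes only between toy_mmap_lock() and toy_mmap_unlock(). The sketch says nothing about the nesting problem raised above, since it has no notion of a second mm whose lock is held at the same time.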