On Wed, Aug 4, 2021 at 5:21 PM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> On Wed, Aug 04, 2021 at 04:42:23PM +0200, Jann Horn wrote:
> > Since I haven't sent a new version of my old series for almost a year,
> > I think it'd be fine to take Luigi's patch for now, and undo it at a
> > later point when/if we want to actually use proper locking here
> > because we're worried about concurrent access to the MM.
>
> IIRC one of the major points of that work was not "proper locking" but
> to have enough locking to be compatible with lockdep so we could add
> assertions like in get_user_pages and find_vma.

That's part of it; but it's also for making the code more clearly
correct and future-proofing it.

Looking at it now, I think process_madvise() might actually already be
able to race with execve() to some degree; and if you made a change
like this to the current kernel:

diff --git a/mm/madvise.c b/mm/madvise.c
index 6d3d348b17f4..3648c198673c 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1043,12 +1043,14 @@ madvise_behavior_valid(int behavior)
 static bool
 process_madvise_behavior_valid(int behavior)
 {
 	switch (behavior) {
 	case MADV_COLD:
 	case MADV_PAGEOUT:
+	case MADV_DOFORK:
+	case MADV_DONTFORK:
 		return true;
 	default:
 		return false;
 	}
 }

it would probably introduce a memory corruption bug, because then
someone might be able to destroy the stack VMA between
setup_new_exec() and setup_arg_pages() by using process_madvise() to
trigger VMA splitting/merging in the right pattern.
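
To make the point about assertions a bit more concrete, here is a rough
userspace sketch of the idea (not kernel code). Everything in it is
hypothetical: struct toy_mm and the toy_* helpers are made up for
illustration and only model "the lock is held" as a flag. The point is
just that once the layout-changing path checks for the lock, a caller
that forgot to take it gets refused instead of silently reshaping the
stack range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct toy_mm {
	bool write_locked;	/* models "mmap lock held for writing" */
	size_t stack_start;	/* models the bounds of the stack VMA */
	size_t stack_end;
};

static void toy_mm_write_lock(struct toy_mm *mm)
{
	mm->write_locked = true;
}

static void toy_mm_write_unlock(struct toy_mm *mm)
{
	/* Unlocking a lock that is not held is a caller bug. */
	assert(mm->write_locked);
	mm->write_locked = false;
}

/* Models a lockdep-style check: does the caller hold the lock? */
static bool toy_mm_is_write_locked(const struct toy_mm *mm)
{
	return mm->write_locked;
}

/*
 * Shrinks the modelled stack range so that it ends at addr.
 * Returns 0 on success, or -1 if the lock is not held or addr does not
 * lie strictly inside the range.
 */
static int toy_split_stack(struct toy_mm *mm, size_t addr)
{
	if (!toy_mm_is_write_locked(mm))
		return -1;
	if (addr <= mm->stack_start || addr >= mm->stack_end)
		return -1;
	mm->stack_end = addr;
	return 0;
}
```

In this toy version the unlocked caller just gets an error back; in the
kernel the equivalent check would be an assertion that fires, which is
what makes the missing locking visible during testing.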