On Tue 15-02-22 12:19:22, Suren Baghdasaryan wrote:
> After exit_mmap frees all vmas in the mm, mm->mmap needs to be reset,
> otherwise it points to a vma that was freed and when reused leads to
> a use-after-free bug.

OK, so I have dived into this again. exit_mmap indeed doesn't reset
mm->mmap. That doesn't really matter for _oom victims_. Both the oom
reaper and mrelease do check for MMF_OOM_SKIP before calling
__oom_reap_task_mm. exit_mmap still sets MMF_OOM_SKIP before taking the
mmap_lock for oom victims so those paths should still be properly
synchronized. I have proposed to get rid of this
http://lkml.kernel.org/r/YbHIaq9a0CtqRulE@xxxxxxxxxxxxxx but we haven't
agreed on that.

The mrelease path is broken because it doesn't mark the process as an
oom_victim and so the MMF_OOM_SKIP synchronization doesn't work. So we
really need this.

I would propose to rephrase the changelog to be more specific because I
do not want to have to remember all those details later on. What about
"
oom reaping (__oom_reap_task_mm) relies on a 2 way synchronization with
exit_mmap. First it relies on the mmap_lock to exclude from the munlock
path[1], page tables tear down (free_pgtables) and vma destruction.
This alone is not sufficient because mm->mmap is never reset. For
historical reasons[2] there is also MMF_OOM_SKIP set for oom victims
before the lock is taken.

The oom reaper only ever looks at oom victims so the whole scheme works
properly but process_mrelease can operate on any task (with fatal
signals pending) which doesn't really imply oom victims. That means that
the MMF_OOM_SKIP part of the synchronization doesn't work and it can
see a task after the whole address space has been demolished and
traverse an already released mm->mmap list. This leads to a use after
free as properly caught by the KASAN report.

Fix the issue by resetting mm->mmap so that MMF_OOM_SKIP synchronization
is not needed anymore. The MMF_OOM_SKIP is not removed from exit_mmap
yet but it acts mostly as an optimization now.

[1] 27ae357fa82b ("mm, oom: fix concurrent munlock and oom reaper unmap, v3")
[2] 212925802454 ("mm: oom: let oom_reap_task and exit_mmap run concurrently")
"

I really have to say that I hate how complex this has grown in the name
of optimizations. This has backfired several times already resulting in
2 security issues. I really hope to get rid of any notion of the oom
reaper from exit_mmap.

> Reported-by: syzbot+2ccf63a4bd07cf39cab0@xxxxxxxxxxxxxxxxxxxxxxxxx
> Suggested-by: Michal Hocko <mhocko@xxxxxxxx>
> Signed-off-by: Suren Baghdasaryan <surenb@xxxxxxxxxx>

Acked-by: Michal Hocko <mhocko@xxxxxxxx>

Thanks!

> ---
>  mm/mmap.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 1e8fdb0b51ed..d445c1b9d606 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -3186,6 +3186,7 @@ void exit_mmap(struct mm_struct *mm)
>  		vma = remove_vma(vma);
>  		cond_resched();
>  	}
> +	mm->mmap = NULL;
>  	mmap_write_unlock(mm);
>  	vm_unacct_memory(nr_accounted);
>  }
> --
> 2.35.1.265.g69c8d7142f-goog

-- 
Michal Hocko
SUSE Labs
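
For illustration only, the effect of the one-line fix can be modeled in
userspace C. This is a rough sketch, not kernel code: struct model_vma,
struct model_mm, model_mm_populate, model_exit_mmap and model_reap_walk
are hypothetical names made up for this example, and locking as well as
MMF_OOM_SKIP are deliberately left out. It only shows why a walker that
runs after the teardown is safe once the list head has been reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified stand-ins for vm_area_struct/mm_struct. */
struct model_vma {
	struct model_vma *vm_next;
};

struct model_mm {
	struct model_vma *mmap;	/* head of the vma list, like mm->mmap */
};

/* Prepend nr vmas to the list; returns 0 on success, -1 if malloc fails. */
int model_mm_populate(struct model_mm *mm, int nr)
{
	for (int i = 0; i < nr; i++) {
		struct model_vma *vma = malloc(sizeof(*vma));

		if (!vma)
			return -1;
		vma->vm_next = mm->mmap;
		mm->mmap = vma;
	}
	return 0;
}

/*
 * Models the tail of exit_mmap: free every vma, then optionally reset
 * the list head. With reset_head == false the head is left pointing at
 * freed memory, which corresponds to the behaviour before the patch.
 */
void model_exit_mmap(struct model_mm *mm, bool reset_head)
{
	struct model_vma *vma;

	assert(mm != NULL);
	vma = mm->mmap;
	while (vma) {
		struct model_vma *next = vma->vm_next;

		free(vma);
		vma = next;
	}
	if (reset_head)
		mm->mmap = NULL;
}

/*
 * Models the reaper's list walk and returns the number of vmas visited.
 * Only safe when mm->mmap is either a live list or NULL; calling it
 * after model_exit_mmap(mm, false) would dereference freed memory.
 */
int model_reap_walk(const struct model_mm *mm)
{
	int visited = 0;

	for (const struct model_vma *vma = mm->mmap; vma; vma = vma->vm_next)
		visited++;
	return visited;
}
```

With reset_head set, a late model_reap_walk() sees an empty list and
visits nothing. Without it, the same call would be the use after free
that KASAN reported, so that variant should not actually be run outside
of a sanitizer build.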