On Wed, 25 Jul 2018, Tetsuo Handa wrote: > >> You might worry about situations where __oom_reap_task_mm() is a no-op. > >> But that is not always true. There is no point with emitting > >> > >> pr_info("oom_reaper: unable to reap pid:%d (%s)\n", ...); > >> debug_show_all_locks(); > >> > >> noise and doing > >> > >> set_bit(MMF_OOM_SKIP, &mm->flags); > >> > >> because exit_mmap() will not release oom_lock until __oom_reap_task_mm() > >> completes. That is, except extra noise, there is no difference with > >> current behavior which sets set_bit(MMF_OOM_SKIP, &mm->flags) after > >> returning from __oom_reap_task_mm(). > >> > > > > v5 has restructured how exit_mmap() serializes its unmapping with the oom > > reaper. It sets MMF_OOM_SKIP while holding mm->mmap_sem. > > > > I think that v5 is still wrong. exit_mmap() keeps mmap_sem held for write does > not prevent oom_reap_task() from emitting the noise and setting MMF_OOM_SKIP > after timeout. Since your purpose is to wait for release of memory which could > not be reclaimed by __oom_reap_task_mm(), what if __oom_reap_task_mm() was no-op and > exit_mmap() was preempted immediately after returning from __oom_reap_task_mm() ? > If exit_mmap() gets preempted indefinitely before it can free any memory, we are better off oom killing another process. The purpose of the timeout is to give an oom victim an amount of time to free its memory and exit before selecting another victim.