On Tue 14-07-20 09:57:14, Yafang Shao wrote: > Memcg oom killer invocation is synchronized by the global oom_lock and > tasks are sleeping on the lock while somebody is selecting the victim or > potentially race with the oom_reaper is releasing the victim's memory. > This can result in a pointless oom killer invocation because a waiter > might be racing with the oom_reaper > > P1 oom_reaper P2 > oom_reap_task mutex_lock(oom_lock) > out_of_memory # no victim because we have one already > __oom_reap_task_mm mute_unlock(oom_lock) > mutex_lock(oom_lock) > set MMF_OOM_SKIP > select_bad_process > # finds a new victim > > The page allocator prevents from this race by trying to allocate after > the lock can be acquired (in __alloc_pages_may_oom) which acts as a last > minute check. Moreover page allocator simply doesn't block on the > oom_lock and simply retries the whole reclaim process. > > Memcg oom killer should do the last minute check as well. Call > mem_cgroup_margin to do that. Trylock on the oom_lock could be done as > well but this doesn't seem to be necessary at this stage. > > [mhocko@xxxxxxxxxx: commit log] > Suggested-by: Michal Hocko <mhocko@xxxxxxxxxx> > Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx> > Cc: Michal Hocko <mhocko@xxxxxxxxxx> > Cc: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> > Cc: David Rientjes <rientjes@xxxxxxxxxx> > Cc: Johannes Weiner <hannes@xxxxxxxxxxx> Acked-by: Michal Hocko <mhocko@xxxxxxxx> Thanks! > --- > v1 -> v2: > - commit log improved by Michal > - retitle the subject from "mm, oom: check memcg margin for parallel oom" > - code simplicity, per Michal > --- > mm/memcontrol.c | 8 +++++++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index 1962232..15e0e18 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -1560,15 +1560,21 @@ static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask, > .gfp_mask = gfp_mask, > .order = order, > }; > - bool ret; > + bool ret = true; > > if (mutex_lock_killable(&oom_lock)) > return true; > + > + if (mem_cgroup_margin(memcg) >= (1 << order)) > + goto unlock; > + > /* > * A few threads which were not waiting at mutex_lock_killable() can > * fail to bail out. Therefore, check again after holding oom_lock. > */ > ret = should_force_charge() || out_of_memory(&oc); > + > +unlock: > mutex_unlock(&oom_lock); > return ret; > } > -- > 1.8.3.1 -- Michal Hocko SUSE Labs