Re: [PATCH v2] memcg, oom: check memcg margin for parallel oom

Michal Hocko <mhocko@xxxxxxxxxx> · Tue, 14 Jul 2020 16:05:41 +0200

On Tue 14-07-20 09:57:14, Yafang Shao wrote:
> Memcg oom killer invocation is synchronized by the global oom_lock and
> tasks are sleeping on the lock while somebody is selecting the victim or
> potentially race with the oom_reaper is releasing the victim's memory.
> This can result in a pointless oom killer invocation because a waiter
> might be racing with the oom_reaper
> 
>         P1              oom_reaper              P2
>                         oom_reap_task           mutex_lock(oom_lock)
>                                                 out_of_memory # no victim because we have one already
>                         __oom_reap_task_mm      mute_unlock(oom_lock)
>  mutex_lock(oom_lock)
>                         set MMF_OOM_SKIP
>  select_bad_process
>  # finds a new victim
> 
> The page allocator prevents from this race by trying to allocate after
> the lock can be acquired (in __alloc_pages_may_oom) which acts as a last
> minute check. Moreover page allocator simply doesn't block on the
> oom_lock and simply retries the whole reclaim process.
> 
> Memcg oom killer should do the last minute check as well. Call
> mem_cgroup_margin to do that. Trylock on the oom_lock could be done as
> well but this doesn't seem to be necessary at this stage.
> 
> [mhocko@xxxxxxxxxx: commit log]
> Suggested-by: Michal Hocko <mhocko@xxxxxxxxxx>
> Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx>
> Cc: Michal Hocko <mhocko@xxxxxxxxxx>
> Cc: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> Cc: David Rientjes <rientjes@xxxxxxxxxx>
> Cc: Johannes Weiner <hannes@xxxxxxxxxxx>

Acked-by: Michal Hocko <mhocko@xxxxxxxx>

Thanks!

> ---
> v1 -> v2:
> - commit log improved by Michal
> - retitle the subject from "mm, oom: check memcg margin for parallel oom"
> - code simplicity, per Michal
> ---
>  mm/memcontrol.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 1962232..15e0e18 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1560,15 +1560,21 @@ static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  		.gfp_mask = gfp_mask,
>  		.order = order,
>  	};
> -	bool ret;
> +	bool ret = true;
>  
>  	if (mutex_lock_killable(&oom_lock))
>  		return true;
> +
> +	if (mem_cgroup_margin(memcg) >= (1 << order))
> +		goto unlock;
> +
>  	/*
>  	 * A few threads which were not waiting at mutex_lock_killable() can
>  	 * fail to bail out. Therefore, check again after holding oom_lock.
>  	 */
>  	ret = should_force_charge() || out_of_memory(&oc);
> +
> +unlock:
>  	mutex_unlock(&oom_lock);
>  	return ret;
>  }
> -- 
> 1.8.3.1

-- 
Michal Hocko
SUSE Labs