Re: [PATCH 3/8] mm,oom: Fix unnecessary killing of additional processes.

Michal Hocko <mhocko@xxxxxxxxxx> · Tue, 3 Jul 2018 16:58:08 +0200

On Tue 03-07-18 23:25:04, Tetsuo Handa wrote:
> David Rientjes is complaining that memcg OOM events needlessly select
> more OOM victims when the OOM reaper was unable to reclaim memory. This
> is because exit_mmap() is setting MMF_OOM_SKIP before calling
> free_pgtables(). While David is trying to introduce timeout based hold
> off approach, Michal Hocko is rejecting plain timeout based approaches.
> 
> Therefore, this patch gets rid of the OOM reaper kernel thread and
> introduces OOM-badness-score based hold off approach. The reason for
> getting rid of the OOM reaper kernel thread is explained below.
> 
>     We are about to start getting a lot of OOM victim processes by
>     introducing "mm, oom: cgroup-aware OOM killer" patchset.
> 
>     When there are multiple OOM victims, we should try reclaiming memory
>     in parallel rather than sequentially wait until conditions to be able
>     to reclaim memory are met. Also, we should preferentially reclaim from
>     OOM domains where currently allocating threads belong to. Also, we
>     want to get rid of schedule_timeout_*(1) heuristic which is used for
>     trying to give CPU resource to the owner of oom_lock. Therefire,
>     direct OOM reaping by allocating threads can do the job better than
>     the OOM reaper kernel thread.
> 
> This patch changes the OOM killer to wait until either __mmput()
> completes or OOM badness score did not decrease for 3 seconds.

So this is yet another timeout based thing... I am really getting tired
of this coming back and forth. I will get ignored most probably again,
but let me repeat. Once you make this timeout based you will really have
to make it tunable by userspace because one timeout never fits all
needs. And that would be quite stupid. Because what we have now is a
feedback based approach. So we retry as long as we can make progress and
fail eventually because we cannot retry for ever. How many times we
retry is an implementation detail so we do not have to expose that.

Anyway, You failed to explain whether there is any fundamental problem
that the current approach has and won't be able to handle or this new
approach would handle much better. So what are sound reasons to rewrite
the whole thing?

Why do you think that pulling the memory reaping into the oom context is
any better?

So color me unconvinced
-- 
Michal Hocko
SUSE Labs