Re: [PATCH] mm, oom: show process exiting information in __oom_kill_process()

Michal Hocko <mhocko@xxxxxxxxxx> · Mon, 20 Jul 2020 15:41:21 +0200

On Mon 20-07-20 20:06:17, Tetsuo Handa wrote:
> On 2020/07/20 19:36, Yafang Shao wrote:
> > On Mon, Jul 20, 2020 at 3:16 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> >> I do agree that a silent bail out is not the best thing to do. The above
> >> message would be more useful if it also explained what the oom killer
> >> does (or does not):
> >>
> >>         "OOM victim %d (%s) is already exiting. Skip killing the task\n"
> >>
> > 
> > Sure.
> 
> This path is rarely hit because find_lock_task_mm() in oom_badness() from
> select_bad_process() in the next round of OOM killer will skip this task.

Agreed!

> Since we don't wake up the OOM reaper when hitting this path, unless __mmput()
> for this task itself immediately reclaims memory and updates the statistics
> counter, we just get two chunks of dump_header() messages and one OOM victim.
> 
> The current synchronous printk() gives __mmput() some time for reclaiming
> memory and updating the statistics counter. But when printk() becomes
> asynchronous, there might be very little time. People might wonder "why does
> the killed message follow immediately after the skipped-killing message?"...
> Wouldn't the skip message confuse people?

I would ask the other way around. Wouldn't that give us a better clue that
the first oom invocation and the back off was a suboptimal decision? If
we learn about more of those, maybe we want to reconsider this heuristic
and rather retry the victim selection instead.

I do not really see how this message would be harmful TBH.
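
To make the proposal concrete, here is a rough userspace sketch of the
behavior being discussed, not the actual kernel code. The struct fake_task
type, its "exiting" flag and the oom_skip_exiting() helper are all made up
for illustration; in the kernel the check would presumably be the
find_lock_task_mm() failure case in __oom_kill_process(), and the message
would go through printk rather than into a buffer.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the few task fields this sketch needs. */
struct fake_task {
	int pid;
	const char *comm;
	bool exiting;	/* assumption: models "no mm left to lock" */
};

/*
 * If the victim is already exiting, format the skip message into buf
 * and return true. Otherwise leave buf untouched and return false so
 * the caller can proceed with the kill.
 */
static bool oom_skip_exiting(const struct fake_task *victim,
			     char *buf, size_t len)
{
	if (!victim->exiting)
		return false;

	snprintf(buf, len,
		 "OOM victim %d (%s) is already exiting. Skip killing the task\n",
		 victim->pid, victim->comm);
	return true;
}
```

The point is only that the bail out stops being silent: whenever the skip
path is taken, the log says which task was skipped and why.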

-- 
Michal Hocko
SUSE Labs