Re: [PATCH] mm, oom: don't invoke oom killer if current has been reapered

Michal Hocko <mhocko@xxxxxxxxxx> · Mon, 13 Jul 2020 08:21:32 +0200

On Mon 13-07-20 08:01:57, Michal Hocko wrote:
> On Fri 10-07-20 23:18:01, Yafang Shao wrote:
[...]
> > There're many threads of a multi-threaded task parallel running in a
> > container on many cpus. Then many threads triggered OOM at the same time,
> > 
> > CPU-1	        CPU-2         ...        CPU-n
> > thread-1        thread-2      ...        thread-n
> > 
> > wait oom_lock   wait oom_lock ...        hold oom_lock
> > 
> >                                          (sigkill received)
> > 
> >                                          select current as victim
> >                                          and wakeup oom reaper
> > 
> >                                          release oom_lock
> > 
> >                                          (MMF_OOM_SKIP set by oom reaper)
> > 
> >                                          (lots of pages are freed)
> > hold oom_lock
> 
> Could you be more specific please? The page allocator never waits for
> the oom_lock and keeps retrying instead. Also __alloc_pages_may_oom
> tries to allocate with the lock held.

I suspect that you are looking at the memcg oom killer, because we do not
do a trylock there for some reason I do not immediately remember off the
top of my head. If this is really the case then I would recommend looking
into how the page allocator implements this and following the same pattern
for memcg as well.
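
To make the pattern concrete, here is a rough userspace sketch of what I
mean by "never wait for the oom_lock, and try to allocate with the lock
held". It is only an analogy and not kernel code: a pthread mutex stands
in for oom_lock, and try_alloc(), kill_victim(), alloc_may_oom() and the
free_pages counter are hypothetical placeholders made up for illustration.

```c
#include <pthread.h>
#include <stdbool.h>

/* Stand-in for the kernel's oom_lock (assumption: userspace analogy). */
static pthread_mutex_t oom_lock = PTHREAD_MUTEX_INITIALIZER;

static int free_pages;  /* toy model of memory that can be handed out */
static int oom_kills;   /* how many times the toy "oom killer" ran */

/* Hypothetical allocation attempt: succeeds only if a page is free. */
static bool try_alloc(void)
{
	if (free_pages > 0) {
		free_pages--;
		return true;
	}
	return false;
}

/* Hypothetical victim kill: pretend the victim's memory got released. */
static void kill_victim(void)
{
	oom_kills++;
	free_pages += 4;
}

/*
 * Returns true if the allocation succeeded. Returns false if the caller
 * should go back and retry the allocation from the start.
 */
static bool alloc_may_oom(void)
{
	bool ok;

	/*
	 * Never sleep on the lock. If somebody else holds it, they are
	 * already dealing with the OOM situation, so back off and retry.
	 */
	if (pthread_mutex_trylock(&oom_lock) != 0)
		return false;

	/*
	 * Last attempt with the lock held: memory may have been freed by
	 * an earlier victim while we were on our way here.
	 */
	ok = try_alloc();
	if (!ok)
		kill_victim();

	pthread_mutex_unlock(&oom_lock);
	return ok;
}
```

With this shape, a thread that loses the race for the lock never gets to
pick a victim on its own. It just retries and normally finds the memory
released on behalf of the first victim.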

-- 
Michal Hocko
SUSE Labs