On Mon 13-07-20 08:01:57, Michal Hocko wrote:
> On Fri 10-07-20 23:18:01, Yafang Shao wrote:
[...]
> > There're many threads of a multi-threaded task parallel running in a
> > container on many cpus. Then many threads triggered OOM at the same time,
> >
> > CPU-1               CPU-2               ...   CPU-n
> > thread-1            thread-2            ...   thread-n
> >
> > wait oom_lock       wait oom_lock       ...   hold oom_lock
> >
> >                                               (sigkill received)
> >
> >                                               select current as victim
> >                                               and wakeup oom reaper
> >
> >                                               release oom_lock
> >
> >                                               (MMF_OOM_SKIP set by oom reaper)
> >
> >                                               (lots of pages are freed)
> > hold oom_lock
>
> Could you be more specific please? The page allocator never waits for
> the oom_lock and keeps retrying instead. Also __alloc_pages_may_oom
> tries to allocate with the lock held.

I suspect that you are looking at the memcg oom killer, because we do
not do a trylock there, for some reason I do not immediately remember
off the top of my head. If this is really the case then I would
recommend looking into how the page allocator implements this and
following the same pattern for memcg as well.
-- 
Michal Hocko
SUSE Labs
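
For illustration only, here is a rough userspace sketch of the pattern
referred to above: trylock instead of waiting on the lock, then one more
allocation attempt with the lock held before picking a victim. This is
not the kernel code. All names below (fake_oom_lock, free_pages,
try_alloc, fake_oom_kill, alloc_may_oom) are made up for the example,
and a C11 atomic_flag stands in for the real mutex.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins, not kernel symbols. */
static atomic_flag fake_oom_lock = ATOMIC_FLAG_INIT;
static long free_pages;     /* pretend pool of free pages */
static int victims_killed;  /* how many victims were selected */

enum oom_result { OOM_ALLOCATED, OOM_KILLED, OOM_RETRY };

/* Take nr pages from the pretend pool if enough are available. */
static bool try_alloc(long nr)
{
	if (free_pages < nr)
		return false;
	free_pages -= nr;
	return true;
}

/* Pretend to kill a victim and get its memory back. */
static void fake_oom_kill(long reclaimed)
{
	victims_killed++;
	free_pages += reclaimed;
}

static enum oom_result alloc_may_oom(long nr, long reclaimed)
{
	enum oom_result ret;

	assert(nr > 0);

	/*
	 * Never wait for the lock. If somebody else holds it, an OOM
	 * kill is already in progress, so back off and let the caller
	 * retry the allocation.
	 */
	if (atomic_flag_test_and_set(&fake_oom_lock))
		return OOM_RETRY;

	/*
	 * Last attempt with the lock held: a previous kill may have
	 * freed enough memory since our allocation failed.
	 */
	if (try_alloc(nr)) {
		ret = OOM_ALLOCATED;
	} else {
		fake_oom_kill(reclaimed);
		ret = OOM_KILLED;
	}

	atomic_flag_clear(&fake_oom_lock);
	return ret;
}
```

With this shape, a thread that comes in after the memory has been freed
either fails the trylock and retries, or gets the lock and succeeds in
the allocation attempt, so it does not select another victim.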