Re: [patch] mm, oom: prevent soft lockup on memcg oom for UP systems

Michal Hocko <mhocko@xxxxxxxxxx> · Thu, 12 Mar 2020 21:16:24 +0100

On Thu 12-03-20 11:20:33, David Rientjes wrote:
> On Thu, 12 Mar 2020, Michal Hocko wrote:
> 
> > > I think the changelog clearly states that we need to guarantee that a 
> > > reclaimer will yield the processor back to allow a victim to exit.  This 
> > > is where we make the guarantee.  If it helps for the specific reason it 
> > > triggered in my testing, we could add:
> > > 
> > > "For example, mem_cgroup_protected() can prohibit reclaim and thus any 
> > > yielding in page reclaim would not address the issue."
> > 
> > I would suggest something like the following:
> > "
> > The reclaim path (including the OOM) relies on explicit scheduling
> > points to hand over execution to tasks which could help with the reclaim
> > process.
> 
> Are there other examples where yielding in the reclaim path would "help 
> with the reclaim process" other than oom victims?  This sentence seems 
> vague.

In the context of UP and !PREEMPT this also includes IO flushers,
filesystems that rely on workers, and probably other things I am not
aware of. If you think this is vague then feel free to reformulate.
All I really do care about is what the next paragraph is explaining.

> > Currently it is mostly shrink_page_list which yields the CPU for
> > each reclaimed page. This might be insufficient though in some
> > configurations. E.g. when a memcg OOM path is triggered in a hierarchy
> > which doesn't have any reclaimable memory because of memory reclaim
> > protection (MEMCG_PROT_MIN) then it is possible to trigger a soft
> > lockup during an out of memory situation on non preemptible kernels
> > <PUT YOUR SOFT LOCKUP SPLAT HERE>
> > 
> > Fix this by adding a cond_resched up in the reclaim path to make sure
> > there is a yield point regardless of reclaimability of the target
> > hierarchy.
> > "
> > 
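
To illustrate the scenario described in the proposed changelog, here is a
userspace toy model. It is not kernel code and not the actual patch: every
name in it (toy_cpu, toy_yield, toy_shrink, toy_reclaim_loop) is made up
for illustration. It assumes a single non-preemptible CPU, on which the
OOM victim can only run when the reclaimer explicitly yields, and it
assumes the only existing yield point is the per-page one.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (hypothetical, not kernel code): one CPU, no preemption. */
struct toy_cpu {
	bool victim_exited;	/* set once the OOM victim gets to run */
	int yields;		/* times the reclaimer gave up the CPU */
};

/* Stand-in for cond_resched(): the only way the victim ever runs. */
static void toy_yield(struct toy_cpu *cpu)
{
	cpu->yields++;
	cpu->victim_exited = true;	/* victim runs and exits */
}

/* Stand-in for page list shrinking: yields once per reclaimed page. */
static int toy_shrink(struct toy_cpu *cpu, int reclaimable_pages)
{
	int reclaimed = 0;

	for (int i = 0; i < reclaimable_pages; i++) {
		reclaimed++;
		toy_yield(cpu);
	}
	return reclaimed;
}

/*
 * Reclaimer retry loop. Returns the iteration on which the victim was
 * seen to have exited, or -1 if max_iters passed first (the model's
 * equivalent of a soft lockup).
 */
static int toy_reclaim_loop(struct toy_cpu *cpu, int reclaimable_pages,
			    bool yield_unconditionally, int max_iters)
{
	assert(max_iters > 0);

	for (int iter = 1; iter <= max_iters; iter++) {
		/* The proposed fix: a yield point that does not depend
		 * on whether anything below is reclaimable. */
		if (yield_unconditionally)
			toy_yield(cpu);

		/* With a fully protected hierarchy this reclaims nothing
		 * and therefore never reaches the per-page yield. */
		toy_shrink(cpu, reclaimable_pages);

		if (cpu->victim_exited)
			return iter;
	}
	return -1;
}
```

With reclaimable_pages == 0 (everything protected) and no unconditional
yield, the loop never hands over the CPU and runs until max_iters. With
the unconditional yield enabled, the victim gets to run on the first
iteration regardless of reclaimability.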

-- 
Michal Hocko
SUSE Labs