On Wed, 11 Mar 2020, Tetsuo Handa wrote:

> >>> diff --git a/mm/vmscan.c b/mm/vmscan.c
> >>> --- a/mm/vmscan.c
> >>> +++ b/mm/vmscan.c
> >>> @@ -2637,6 +2637,8 @@ static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
> >>>  		unsigned long reclaimed;
> >>>  		unsigned long scanned;
> >>>
> >>> +		cond_resched();
> >>> +
> >>
> >> Is this safe for CONFIG_PREEMPTION case? If current thread has realtime priority,
> >> can we guarantee that the OOM victim (well, the OOM reaper kernel thread rather
> >> than the OOM victim ?) gets scheduled?
> >>
> >
> > I think it's the best we can do that immediately solves the issue unless
> > you have another idea in mind?
>
> "schedule_timeout_killable(1) outside of oom_lock" or "the OOM reaper grabs oom_lock
> so that allocating threads guarantee that the OOM reaper gets scheduled" or "direct OOM
> reaping so that allocating threads guarantee that some memory is reclaimed".
>

The cond_resched() here is needed if the iteration is lengthy depending on
the number of descendant memcgs already.

schedule_timeout_killable(1) does not make any guarantees that current will
be scheduled after the victim or oom_reaper on UP systems.

If you have an alternate patch to try, we can test it.  But since this
cond_resched() is needed anyway, I'm not sure it will change the result.

> >
> >>> 	switch (mem_cgroup_protected(target_memcg, memcg)) {
> >>> 	case MEMCG_PROT_MIN:
> >>> 		/*
> >>>
> >>
> >
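
[Editor's note: for readers following the thread, below is a minimal sketch of
the memcg iteration the patch touches, loosely based on shrink_node_memcgs()
in mm/vmscan.c around the v5.6 era. The per-memcg reclaim body is elided and
details are simplified; this is not the actual kernel source, only an
illustration of where the added cond_resched() sits in the loop.]

	/*
	 * Simplified sketch (not verbatim kernel code): the loop walks every
	 * descendant memcg of the reclaim target, so a long walk with no
	 * scheduling point can keep other runnable tasks off the CPU on
	 * !CONFIG_PREEMPTION kernels.  The patch adds a cond_resched() at the
	 * top of each iteration.
	 */
	static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
	{
		struct mem_cgroup *target_memcg = sc->target_mem_cgroup;
		struct mem_cgroup *memcg;

		memcg = mem_cgroup_iter(target_memcg, NULL, NULL);
		do {
			unsigned long reclaimed;
			unsigned long scanned;

			cond_resched();	/* voluntary scheduling point added by the patch */

			/*
			 * ... mem_cgroup_protected() checks, shrink_lruvec(),
			 * shrink_slab(), and the reclaimed/scanned accounting
			 * are elided here ...
			 */
		} while ((memcg = mem_cgroup_iter(target_memcg, memcg, NULL)));
	}

[cond_resched() is essentially free when no reschedule is pending, but under
!CONFIG_PREEMPTION it is the only point in this loop where another task can
be scheduled, which is why it addresses the stall being discussed.]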