Re: [patch 1/2] mm, memcg: avoid oom notification when current needs access to memory reserves

On Tue, 17 Dec 2013, Michal Hocko wrote:

> > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > index c72b03bf9679..fee25c5934d2 100644
> > > --- a/mm/memcontrol.c
> > > +++ b/mm/memcontrol.c
> > > @@ -2692,7 +2693,8 @@ static int __mem_cgroup_try_charge(struct mm_struct *mm,
> > >  	 * MEMDIE process.
> > >  	 */
> > >  	if (unlikely(test_thread_flag(TIF_MEMDIE)
> > > -		     || fatal_signal_pending(current)))
> > > +		     || fatal_signal_pending(current))
> > > +		     || current->flags & PF_EXITING)
> > >  		goto bypass;
> > >  
> > >  	if (unlikely(task_in_memcg_oom(current)))
> > > 
> > > rather than the later checks down the oom_synchronize paths. The comment
> > > already mentions dying process...
> > > 
> > 
> > This is scary because it doesn't even try to reclaim memcg memory before 
> > allowing the allocation to succeed.
> 
> Why should it reclaim in the first place when it simply is on the way to
> release memory. In other words why should it increase the memory
> pressure when it is in fact releasing it?
> 

(Answering about removing the fatal_signal_pending() check as well here.)

For memory isolation, we'd only want to bypass memcg charges when 
absolutely necessary, and it seems like TIF_MEMDIE is the only case where 
that's required.  For the same reason, the page allocator doesn't give 
processes with pending SIGKILLs or those in the exit() path access to 
memory reserves without first determining that reclaim can't make any 
progress, and even then it only does so by setting TIF_MEMDIE when 
calling the oom killer.
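
To make the ordering argued for above concrete, here is a rough userspace 
model, not kernel code.  The struct and function names (model_task, 
model_memcg, model_try_charge) are hypothetical and exist only for this 
illustration; the real logic lives in __mem_cgroup_try_charge() and is 
considerably more involved.  The sketch assumes reclaimable never exceeds 
usage.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of task state; not the kernel's task_struct. */
struct model_task {
	bool memdie;        /* models TIF_MEMDIE: already chosen by the oom killer */
	bool fatal_signal;  /* models fatal_signal_pending(current) */
	bool exiting;       /* models current->flags & PF_EXITING */
};

/* Hypothetical model of a memcg, counted in pages. */
struct model_memcg {
	size_t limit;
	size_t usage;
	size_t reclaimable; /* pages reclaim could free; assumed <= usage */
};

enum charge_result { CHARGED, CHARGED_AFTER_RECLAIM, BYPASSED, FAILED_OOM };

static enum charge_result model_try_charge(struct model_memcg *cg,
					   const struct model_task *t,
					   size_t nr_pages)
{
	/* Only a task already selected by the oom killer skips the charge. */
	if (t->memdie)
		return BYPASSED;

	if (cg->usage + nr_pages <= cg->limit) {
		cg->usage += nr_pages;
		return CHARGED;
	}

	/* Killed or exiting tasks still go through reclaim like anyone else. */
	size_t needed = cg->usage + nr_pages - cg->limit;
	if (cg->reclaimable >= needed) {
		cg->reclaimable -= needed;
		cg->usage -= needed;
		cg->usage += nr_pages;
		return CHARGED_AFTER_RECLAIM;
	}

	/* Reclaim cannot make progress: the point where the oom killer would run. */
	return FAILED_OOM;
}
```

In this model the fatal_signal and exiting fields are deliberately never 
consulted: such a task gets no shortcut until reclaim has failed and the 
oom killer has marked it.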

> I am really puzzled here. On one hand you are strongly arguing for not
> notifying when we know we can prevent from OOM action and on the other
> hand you are ok to get vmpressure/thresholds notification when an
> exiting task triggers reclaim.
> 
> So I am really lost in what you are trying to achieve here. It sounds a
> bit arbitrary.
> 

It's not arbitrary to define when memcg bypass is allowed and, in my 
opinion, it should only be done in situations where it is unavoidable and 
therefore breaking memory isolation is required.

(We wouldn't expect a 128MB memcg to be oom [and perhaps with a userspace 
oom handler attached] when it has 100 children each 1MB in size just 
because they all happen to be oom at the same time.  We set up the excess 
memory in the parent specifically for the memcg with the oom handler 
attached.)
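
The arithmetic behind that example can be sketched with a toy hierarchical 
charge model.  Again this is a userspace illustration with hypothetical 
names (model_group, model_charge), and it assumes each child has a 1MB 
limit and that every charge is propagated to the parent.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a memcg in a hierarchy, counted in MB. */
struct model_group {
	struct model_group *parent;
	size_t limit_mb;
	size_t usage_mb;
};

/*
 * Charge mb to g and to every ancestor.  Fails without side effects if
 * any level would exceed its limit.
 */
static bool model_charge(struct model_group *g, size_t mb)
{
	for (struct model_group *p = g; p; p = p->parent)
		if (p->usage_mb + mb > p->limit_mb)
			return false;

	for (struct model_group *p = g; p; p = p->parent)
		p->usage_mb += mb;

	return true;
}

/* Headroom left at one level of the hierarchy. */
static size_t model_headroom(const struct model_group *g)
{
	return g->limit_mb - g->usage_mb;
}
```

With a 128MB parent and 100 children filled to 1MB each, every child 
rejects a further charge while the parent still has 28MB of headroom, 
which is the excess set aside for the memcg with the oom handler.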
