Re: [patch 2/2] memcg: do not sleep on OOM waitqueue with full charge context

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 6 Jun 2013, Johannes Weiner wrote:

> > Could you point me to those bug reports?  As far as I know, we have never 
> > encountered them so it would be surprising to me that we're running with a 
> > potential landmine and have seemingly never hit it.
> 
> Sure thing: https://lkml.org/lkml/2012/11/21/497
> 

Ok, I think I read most of it, although the lkml.org interface makes it 
easy to miss some.

> During that thread Michal pinned down the problem to i_mutex being
> held by the OOM invoking task, which the selected victim is trying to
> acquire.
> 
> > > > > Reported-by: Reported-by: azurIt <azurit@xxxxxxxx>

Ok, so the key here is that azurIt was able to reliably reproduce this 
issue and now it has been resurrected after seven months of silence since 
that thread.  I also notice that azurIt isn't cc'd on this thread.  Do we 
know if this is still a problem?

We certainly haven't run into any memcg deadlocks like this.

> > It certainly would, but it's not the point that memory.oom_delay_millisecs 
> > was intended to address.  memory.oom_delay_millisecs would simply delay 
> > calling mem_cgroup_out_of_memory() unless userspace can't free memory or 
> > increase the memory limit in time.  Obviously that delay isn't going to 
> > magically address any lock dependency issues.
> 
> The delayed fallback would certainly resolve the issue of the
> userspace handler getting stuck, be it due to memory shortness or due
> to locks.
> 
> However, it would not solve the part of the problem where the OOM
> killing kernel task is holding locks that the victim requires to exit.
> 

Right.

> We are definitely looking at multiple related issues, that's why I'm
> trying to fix them step by step.
> 

I guess my question is why this would be addressed now when nobody has 
reported it recently on any recent kernel and then not cc the person who 
reported it?

Can anybody, even with an instrumented kernel to make it more probable, 
reproduce the issue this is addressing?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]