Re: How to handle TIF_MEMDIE stalls?

On Fri, Dec 19, 2014 at 09:22:49PM +0900, Tetsuo Handa wrote:
> (Renamed thread's title and invited Dave Chinner. A memory stressing program
> at http://marc.info/?l=linux-mm&m=141890469424353&w=2 can trigger stalls on
> a system with 4 CPUs/2048MB of RAM/no swap. I want to hear your opinion.)
> 
> Michal Hocko wrote:
> > > My question is quite simple. How can we avoid memory allocation stalls when
> > >
> > >   System has 2048MB of RAM and no swap.
> > >   Memcg1 for task1 has quota 512MB and 400MB in use.
> > >   Memcg2 for task2 has quota 512MB and 400MB in use.
> > >   Memcg3 for task3 has quota 512MB and 400MB in use.
> > >   Memcg4 for task4 has quota 512MB and 400MB in use.
> > >   Memcg5 for task5 has quota 512MB and 1MB in use.
> > >
> > > and task5 launches the memory consumption program below, which would trigger
> > > the global OOM killer before triggering the memcg OOM killer?
> > >
> > [...]
> > > The global OOM killer will try to kill this program because this program
> > > will be using 400MB+ of RAM by the time the global OOM killer is triggered.
> > > But sometimes this program cannot be terminated by the global OOM killer
> > > due to XFS lock dependency.
> > >
> > > You can see what is happening from OOM traces after uptime > 320 seconds of
> > > http://I-love.SAKURA.ne.jp/tmp/serial-20141213.txt.xz though memcg is not
> > > configured on this program.
> >
> > This is clearly a separate issue. It is a lock dependency and that alone
> > _cannot_ be handled from OOM killer as it doesn't understand lock
> > dependencies. This should be addressed from the xfs point of view IMHO
> > but I am not familiar with this filesystem to tell you how or whether it
> > is possible.
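
For reference, the arithmetic behind the quoted scenario can be sketched as below. This is only a rough illustration: the helper names are hypothetical, and it assumes all 2048MB are available to the five tasks, ignoring kernel memory, page cache and reclaim. Under those assumptions the machine runs out of free memory before task5 reaches its 512MB memcg limit, which is why the global OOM killer would fire first.

```python
# Rough sketch of the quoted scenario. The figures come from the quoted
# mail; the function names are hypothetical and the model ignores kernel
# memory, page cache and reclaim.

TOTAL_RAM_MB = 2048
MEMCG_LIMIT_MB = 512
USAGE_MB = {"memcg1": 400, "memcg2": 400, "memcg3": 400,
            "memcg4": 400, "memcg5": 1}


def global_headroom(total_ram_mb, usage):
    """Memory left on the machine before it is globally exhausted."""
    return total_ram_mb - sum(usage.values())


def memcg_headroom(limit_mb, usage, name):
    """Memory one memcg may still allocate before hitting its own limit."""
    return limit_mb - usage[name]


def first_limit_hit(total_ram_mb, limit_mb, usage, grower):
    """Report which limit a single growing memcg runs into first."""
    if global_headroom(total_ram_mb, usage) < memcg_headroom(limit_mb, usage, grower):
        return "global"
    return "memcg"


if __name__ == "__main__":
    print("global headroom (MB):", global_headroom(TOTAL_RAM_MB, USAGE_MB))
    print("memcg5 headroom (MB):", memcg_headroom(MEMCG_LIMIT_MB, USAGE_MB, "memcg5"))
    print("first limit hit:", first_limit_hit(TOTAL_RAM_MB, MEMCG_LIMIT_MB, USAGE_MB, "memcg5"))
```

In this simplified model task5 would hold roughly 448MB when the machine is exhausted, in line with the "400MB+" figure quoted above.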

What XFS lock dependency? I see nothing in that output file that indicates a
lock dependency problem - can you point out what the issue is here?

> Then, let's ask Dave Chinner whether he can address it. My opinion is that
> everybody is doing __GFP_WAIT memory allocation without understanding the
> entire dependencies. Everybody is only prepared for allocation failures
> because everybody is expecting that the OOM killer shall somehow solve the
> OOM condition (except that some are expecting that memory stress that will
> trigger the OOM killer must not be given). I am not familiar with XFS either,
> but I don't think this issue can be addressed from the XFS point of view.

Well, I can't comment (nor am I going to waste time speculating)
until someone actually explains the XFS lock dependency that is
apparently causing reclaim problems.

Has lockdep reported any problems?

Cheers,

Dave.
-- 
Dave Chinner
dchinner@xxxxxxxxxx
