Re: [BUG] fatal hang untarring 90GB file, possibly writeback related.

On Thu, Apr 28, 2011 at 02:59:27PM -0500, James Bottomley wrote:
> On Thu, 2011-04-28 at 20:21 +0100, Mel Gorman wrote:
> > On Thu, Apr 28, 2011 at 01:30:36PM -0500, James Bottomley wrote:
> > > > Way hey, cgroups are also in the mix. How jolly.
> > > > 
> > > > Is systemd a common element of the machines hitting this bug by any
> > > > chance?
> > > 
> > > Well, yes, the bug report is against FC15, which needs cgroups for
> > > systemd.
> > > 
> > 
> > Ok although we do not have direct evidence that it's the problem yet. A
> > broken shrinker could just mean we are also trying to aggressively
> > reclaim in cgroups.
> > 
> > > > The remaining traces seem to be follow-on damage related to the three
> > > > issues of "shrinkers are bust in some manner" causing "we are not
> > > > getting over the min watermark" and as a side-show "we are spending lots
> > > > of time doing something unspecified but unhelpful in cgroups".
> > > 
> > > Heh, well find a way for me to verify this: I can't turn off cgroups
> > > because systemd then won't work and the machine won't boot ...
> > > 
> > 
> > Same testcase, same kernel but a distro that is not using systemd to
> > verify if cgroups are the problem. Not ideal I know. When I'm back
> > online Tuesday, I'll try reproducing this on a !Fedora distribution. In
> > the meantime, the following untested hatchet job might spit out
> > which shrinker we are getting stuck in. It is also breaking out of
> > the shrink_slab loop so it'd even be interesting to see if the bug
> > is mitigated in any way.
> 
> Actually, talking to Chris, I think I can get the system up using
> init=/bin/bash without systemd, so I can try the no cgroup config.
> 
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index c74a501..ed99104 100644
> 
> In the meantime, this patch produces:
> 
> (that's nothing ... apparently the trace doesn't activate when kswapd
> goes mad).
> 

Or it is looping there for a shorter time than we expect. HZ/10?

-- 
Mel Gorman
SUSE Labs
