On Thu, 2011-05-12 at 09:43 -0500, Christoph Lameter wrote:
> On Wed, 11 May 2011, Mel Gorman wrote:
>
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -2198,7 +2198,7 @@ EXPORT_SYMBOL(kmem_cache_free);
> >   * take the list_lock.
> >   */
> >  static int slub_min_order;
> > -static int slub_max_order = PAGE_ALLOC_COSTLY_ORDER;
> > +static int slub_max_order;
>
> If we really need to do this then do not push this down to zero please.
> SLAB uses order 1 for the max. Let's at least keep it there.

1 is the current value.  Reducing it to zero seems to fix the
kswapd-induced hangs.  The problem does look to be some
shrinker/allocator interference somewhere in vmscan.c, but the fact is
that it's triggered by SLUB and not SLAB.  I really think that what's
happening is some type of feedback loop where one of the shrinkers is
issuing a wakeup_kswapd() so kswapd never sleeps (and never
relinquishes the CPU on non-preempt).

> We have been using SLUB for a long time. Why is this issue arising now?
> Due to compaction etc making reclaim less efficient?

This is the snark argument (I've said it thrice, the bellman cried, and
what I tell you three times is true).  The fact is that no enterprise
distribution at all uses SLUB.  It's only recently that the desktop
distributions started to ... the bugs are showing up under FC15 beta,
which is the first Fedora distribution to enable it.  I'd say we're
only just beginning widespread SLUB testing.

James