On Fri, Aug 20, 2010 at 01:34:47PM +0800, Wu Fengguang wrote:
> You do run lots of tasks: kernel_stack=1880kB.
>
> And you have lots of free memory, page reclaim has never run, so
> inactive_anon=0. This is where compaction is different from vmscan.
> In vmscan, inactive_anon is reasonably large, and will only be
> compared directly with isolated_anon.
>

True, the key observation here is that compaction is being run via the
proc trigger. Normally it would run as part of the direct reclaim path,
when kswapd would already be awake. For compaction, too_many_isolated()
needs to be different so that it takes the whole system into account.

What would be the best alternative? Here is one possibility. Another
reasonable alternative would be that, when inactive < active, isolated
can't be more than num_online_cpus() * 2 (i.e. one compactor per online
cpu).

diff --git a/mm/compaction.c b/mm/compaction.c
index 94cce51..1e000b7 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -215,14 +215,16 @@ static void acct_isolated(struct zone *zone, struct compact_control *cc)
 static bool too_many_isolated(struct zone *zone)
 {
-	unsigned long inactive, isolated;
+	unsigned long active, inactive, isolated;
 
+	active = zone_page_state(zone, NR_ACTIVE_FILE) +
+					zone_page_state(zone, NR_ACTIVE_ANON);
 	inactive = zone_page_state(zone, NR_INACTIVE_FILE) +
 					zone_page_state(zone, NR_INACTIVE_ANON);
 	isolated = zone_page_state(zone, NR_ISOLATED_FILE) +
 					zone_page_state(zone, NR_ISOLATED_ANON);
 
-	return isolated > inactive;
+	return (inactive > active) ? isolated > inactive : false;
 }
 
 /*
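
For comparison, here is a rough user-space sketch of the two throttling
rules discussed in this mail: the check from the patch, and the
alternative that caps isolated pages at num_online_cpus() * 2 when
inactive < active. It is not kernel code. struct zone_counts, both
function names and the online_cpus parameter are hypothetical stand-ins
for the zone_page_state() counters and num_online_cpus(). The fallback
to "isolated > inactive" when inactive >= active is an assumption, since
the mail does not say what should happen in that case.

```c
#include <stdbool.h>

/*
 * Hypothetical snapshot of the per-zone counters (not a kernel type).
 * Each field stands for the sum of the FILE and ANON counters.
 */
struct zone_counts {
	unsigned long active;	/* NR_ACTIVE_FILE + NR_ACTIVE_ANON */
	unsigned long inactive;	/* NR_INACTIVE_FILE + NR_INACTIVE_ANON */
	unsigned long isolated;	/* NR_ISOLATED_FILE + NR_ISOLATED_ANON */
};

/* Rule from the patch: only throttle when inactive > active. */
static bool too_many_isolated_patch(const struct zone_counts *c)
{
	return (c->inactive > c->active) ? c->isolated > c->inactive : false;
}

/*
 * Alternative rule: when inactive < active, cap isolated pages at
 * 2 * online_cpus. Otherwise fall back to the original comparison
 * (assumed here, not stated in the mail).
 */
static bool too_many_isolated_capped(const struct zone_counts *c,
				     unsigned int online_cpus)
{
	if (c->inactive < c->active)
		return c->isolated > 2UL * online_cpus;

	return c->isolated > c->inactive;
}
```

With plenty of free memory and no reclaim activity (inactive = 0), the
patch rule never throttles, while the capped rule still limits how many
pages can be isolated at once.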