On Fri, 24 Jun 2011 15:44:54 +0100
Mel Gorman <mgorman@xxxxxxx> wrote:

> During allocator-intensive workloads, kswapd will be woken frequently
> causing free memory to oscillate between the high and min watermark.
> This is expected behaviour.
>
> A problem occurs if the highest zone is small. balance_pgdat()
> only considers unreclaimable zones when priority is DEF_PRIORITY
> but sleeping_prematurely considers all zones. It's possible for this
> sequence to occur
>
>   1. kswapd wakes up and enters balance_pgdat()
>   2. At DEF_PRIORITY, marks highest zone unreclaimable
>   3. At DEF_PRIORITY-1, ignores highest zone setting end_zone
>   4. At DEF_PRIORITY-1, calls shrink_slab freeing memory from
>      highest zone, clearing all_unreclaimable. Highest zone
>      is still unbalanced
>   5. kswapd returns and calls sleeping_prematurely
>   6. sleeping_prematurely looks at *all* zones, not just the ones
>      being considered by balance_pgdat. The highest small zone
>      has all_unreclaimable cleared but the zone is not
>      balanced. all_zones_ok is false so kswapd stays awake
>
> This patch corrects the behaviour of sleeping_prematurely to check
> the zones balance_pgdat() checked.

But kswapd is making progress: it's reclaiming slab.  Eventually that
won't work any more and all_unreclaimable will not be cleared and the
condition will fix itself up?

btw,

	if (!sleeping_prematurely(...))
		sleep();

hurts my brain.  My brain would prefer

	if (kswapd_should_sleep(...))
		sleep();

no?

> Reported-and-tested-by: Pádraig Brady <P@xxxxxxxxxxxxxx>

But what were the before-and-after observations?  I don't understand
how this can cause a permanent cpuchew by kswapd.

> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2323,7 +2323,7 @@ static bool sleeping_prematurely(pg_data_t *pgdat, int order, long remaining,
>  		return true;
>
>  	/* Check the watermark levels */
> -	for (i = 0; i < pgdat->nr_zones; i++) {
> +	for (i = 0; i <= classzone_idx; i++) {
>  		struct zone *zone = pgdat->node_zones + i;
>
>  		if (!populated_zone(zone))

The patch looks sensible.
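
For anyone following along, here is a rough userspace sketch of what the
loop-bound change means. It is not the kernel code: `struct toy_zone`,
`struct toy_pgdat` and both `toy_*` functions are made-up names, the
watermark check is reduced to a precomputed `balanced` flag, and everything
else the real sleeping_prematurely() does is left out. Passing
`nr_zones - 1` as `last_idx` models the old loop, and passing the
classzone index models the patched one. The second function only
illustrates the positively-named wrapper suggested above.

```c
#include <stdbool.h>

#define TOY_MAX_ZONES 4

/* Hypothetical stand-in for struct zone; field names are illustrative. */
struct toy_zone {
	bool populated;         /* zone has pages at all */
	bool all_unreclaimable; /* reclaim has given up on this zone */
	bool balanced;          /* stands in for the watermark check */
};

/* Hypothetical stand-in for pg_data_t. */
struct toy_pgdat {
	struct toy_zone zones[TOY_MAX_ZONES];
	int nr_zones;
};

/*
 * Returns true if kswapd should stay awake.
 * last_idx is inclusive: nr_zones - 1 models the old behaviour,
 * the classzone index models the patched behaviour.
 */
static bool toy_sleeping_prematurely(const struct toy_pgdat *pgdat,
				     int last_idx)
{
	int i;

	for (i = 0; i <= last_idx && i < pgdat->nr_zones; i++) {
		const struct toy_zone *zone = &pgdat->zones[i];

		if (!zone->populated)
			continue;
		if (zone->all_unreclaimable)
			continue;
		if (!zone->balanced)
			return true; /* an unbalanced zone keeps kswapd awake */
	}
	return false;
}

/* Same test with the sense inverted, so callers read positively. */
static bool toy_kswapd_should_sleep(const struct toy_pgdat *pgdat,
				    int last_idx)
{
	return !toy_sleeping_prematurely(pgdat, last_idx);
}
```

With three populated zones where only the highest is unbalanced and has
all_unreclaimable cleared (the state after step 4 in the changelog),
checking up to `nr_zones - 1` keeps the toy kswapd awake, while checking
only up to index 1 lets it sleep.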