Commit 4ffeaf35 (mm: page_alloc: reduce cost of the fair zone allocation policy) arguably broke the fair zone allocation policy on UP with these hunks. a/mm/page_alloc.c b/mm/page_alloc.c @@ -1612,6 +1612,9 @@ again: } __mod_zone_page_state(zone, NR_ALLOC_BATCH, -(1 << order)); + if (zone_page_state(zone, NR_ALLOC_BATCH) == 0 && + !zone_is_fair_depleted(zone)) + zone_set_flag(zone, ZONE_FAIR_DEPLETED); __count_zone_vm_events(PGALLOC, zone, 1 << order); zone_statistics(preferred_zone, zone, gfp_flags); @@ -1966,8 +1985,10 @@ zonelist_scan: if (alloc_flags & ALLOC_FAIR) { if (!zone_local(preferred_zone, zone)) break; - if (zone_page_state(zone, NR_ALLOC_BATCH) <= 0) + if (zone_is_fair_depleted(zone)) { + nr_fair_skipped++; continue; + } } A <= check was replaced with a ==. On SMP it doesn't matter because negative values are returned as zero due to per-CPU drift which is not possible in the UP case. Vlastimil Babka correctly pointed out that this can wrap negative due to high-order allocations. However, Leon Romanovsky pointed out that a <= check on zone_page_state was never correct as zone_page_state returns unsigned long so the root cause of the breakage was the <= check in the first place. zone_page_state is an API hazard because of the difference in behaviour between SMP and UP is very surprising. There is a good reason to allow NR_ALLOC_BATCH to go negative -- when the counter is reset the negative value takes recent activity into account. This patch makes zone_page_state behave the same on SMP and UP as saving one branch on UP is not likely to make a measurable performance difference. Reported-by: Vlastimil Babka <vbabka@xxxxxxx> Reported-by: Leon Romanovsky <leon@xxxxxxx> Signed-off-by: Mel Gorman <mgorman@xxxxxxx> --- include/linux/vmstat.h | 2 -- 1 file changed, 2 deletions(-) diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h index 82e7db7..cece0f0 100644 --- a/include/linux/vmstat.h +++ b/include/linux/vmstat.h @@ -131,10 +131,8 @@ static inline unsigned long zone_page_state(struct zone *zone, enum zone_stat_item item) { long x = atomic_long_read(&zone->vm_stat[item]); -#ifdef CONFIG_SMP if (x < 0) x = 0; -#endif return x; } -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html