On Fri, Feb 14, 2014 at 07:34:17PM +0100, Bartlomiej Zolnierkiewicz wrote: > Pages allocated from MIGRATE_RESERVE migratetype pageblocks > are not freed back to MIGRATE_RESERVE migratetype free > lists in free_pcppages_bulk()->__free_one_page() if we got > to free_pcppages_bulk() through drain_[zone_]pages(). > The freeing through free_hot_cold_page() is okay because > freepage migratetype is set to pageblock migratetype before > calling free_pcppages_bulk(). If pages of MIGRATE_RESERVE > migratetype end up on the free lists of other migratetype > whole Reserved pageblock may be later changed to the other > migratetype in __rmqueue_fallback() and it will be never > changed back to be a Reserved pageblock. Fix the issue by > preserving freepage migratetype as a pageblock migratetype > (instead of overriding it to the requested migratetype) > for MIGRATE_RESERVE migratetype pages in rmqueue_bulk(). > > The problem was introduced in v2.6.31 by commit ed0ae21 > ("page allocator: do not call get_pageblock_migratetype() > more than necessary"). > > Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@xxxxxxxxxxx> > Reported-by: Yong-Taek Lee <ytk.lee@xxxxxxxxxxx> > Cc: Marek Szyprowski <m.szyprowski@xxxxxxxxxxx> > Cc: Mel Gorman <mgorman@xxxxxxx> > Cc: Hugh Dickins <hughd@xxxxxxxxxx> > --- > v2: > - updated patch description, there is no __zone_pcp_update() > in newer kernels > > include/linux/mmzone.h | 5 +++++ > mm/page_alloc.c | 10 +++++++--- > 2 files changed, 12 insertions(+), 3 deletions(-) > > Index: b/include/linux/mmzone.h > =================================================================== > --- a/include/linux/mmzone.h 2014-02-14 18:59:08.177837747 +0100 > +++ b/include/linux/mmzone.h 2014-02-14 18:59:09.077837731 +0100 > @@ -63,6 +63,11 @@ enum { > MIGRATE_TYPES > }; > > +static inline bool is_migrate_reserve(int migratetype) > +{ > + return unlikely(migratetype == MIGRATE_RESERVE); > +} > + > #ifdef CONFIG_CMA > # define is_migrate_cma(migratetype) unlikely((migratetype) == MIGRATE_CMA) > #else > Index: b/mm/page_alloc.c > =================================================================== > --- a/mm/page_alloc.c 2014-02-14 18:59:08.185837746 +0100 > +++ b/mm/page_alloc.c 2014-02-14 18:59:09.077837731 +0100 > @@ -1174,7 +1174,7 @@ static int rmqueue_bulk(struct zone *zon > unsigned long count, struct list_head *list, > int migratetype, int cold) > { > - int mt = migratetype, i; > + int mt, i; > > spin_lock(&zone->lock); > for (i = 0; i < count; ++i) { > @@ -1195,9 +1195,13 @@ static int rmqueue_bulk(struct zone *zon > list_add(&page->lru, list); > else > list_add_tail(&page->lru, list); > + mt = get_pageblock_migratetype(page); > if (IS_ENABLED(CONFIG_CMA)) { > - mt = get_pageblock_migratetype(page); > - if (!is_migrate_cma(mt) && !is_migrate_isolate(mt)) > + if (!is_migrate_cma(mt) && !is_migrate_isolate(mt) && > + !is_migrate_reserve(mt)) > + mt = migratetype; > + } else { > + if (!is_migrate_reserve(mt)) > mt = migratetype; Minimally, this could be simplified because now it's an unconditional call to get_pageblock_migratetype. However, it looks like this could be improved without doing that. __rmqueue_fallback will be called if a page of the requested migratetype was not found. Furthermore, if a pageblock has been stolen then the pages are shuffled between free lists so you should be able to modify this patch to 1. have __rmqueue call set_freepage_migratetype(migratetype) if __rmqueue_smallest found a page 2. have __rmqueue_fallback call set_freepage_migratetype(new_type) when it has selected which freelist to select from. Can you check it out as an alternative to this patch please as it would have much less overhead than unconditionally calling get_pageblock_migratetype()? -- Mel Gorman SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>