Hi Bart, On Thu, Sep 13, 2012 at 04:21:42PM +0200, Bartlomiej Zolnierkiewicz wrote: > On Thursday 06 September 2012 18:34:35 Bartlomiej Zolnierkiewicz wrote: > > > > Hi, > > > > On Thursday 06 September 2012 04:53:38 Minchan Kim wrote: > > > Normally, MIGRATE_ISOLATE type is used for memory-hotplug. > > > But it's irony type because the pages isolated would exist > > > as free page in free_area->free_list[MIGRATE_ISOLATE] so people > > > can think of it as allocatable pages but it is *never* allocatable. > > > It ends up confusing NR_FREE_PAGES vmstat so it would be > > > totally not accurate so some of place which depend on such vmstat > > > could reach wrong decision by the context. > > > > > > There were already report about it.[1] > > > [1] 702d1a6e, memory-hotplug: fix kswapd looping forever problem > > > > > > Then, there was other report which is other problem.[2] > > > [2] http://www.spinics.net/lists/linux-mm/msg41251.html > > > > > > I believe it can make problems in future, too. > > > So I hope removing such irony type by another design. > > > > > > I hope this patch solves it and let's revert [1] and doesn't need [2]. > > For our needs (CMA) patch [2] is much simpler / less intrusive way > to have correct NR_FREE_PAGES counter than this patch and currently > I would prefer to have it merged upstream instead of this one. I agree my patch could be somewhat big change so I will take things easy. Of course, it shouldn't prevent your patch merge if yours make sense. Afterward, if this patch solves all issues and better than other band-aid, then, we can revert. Shortly, I will review yours. > > > > * Changelog v1 > > > * Fix from Michal's many suggestion > > > > > > Cc: Michal Nazarewicz <mina86@xxxxxxxxxx> > > > Cc: Mel Gorman <mel@xxxxxxxxx> > > > Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> > > > Cc: Yasuaki Ishimatsu <isimatu.yasuaki@xxxxxxxxxxxxxx> > > > Cc: Wen Congyang <wency@xxxxxxxxxxxxxx> > > > Cc: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> > > > Signed-off-by: Minchan Kim <minchan@xxxxxxxxxx> > > > --- > > > It's very early version which show the concept so I still marked it with RFC. > > > I just tested it with simple test and works. > > > This patch is needed indepth review from memory-hotplug guys from fujitsu > > > because I saw there are lots of patches recenlty they sent to about > > > memory-hotplug change. Please take a look at this patch. > > > > [...] > > > > > @@ -948,8 +954,13 @@ static int move_freepages(struct zone *zone, > > > } > > > > > > order = page_order(page); > > > - list_move(&page->lru, > > > - &zone->free_area[order].free_list[migratetype]); > > > + if (migratetype != MIGRATE_ISOLATE) { > > > + list_move(&page->lru, > > > + &zone->free_area[order].free_list[migratetype]); > > > + } else { > > > + list_del(&page->lru); > > > + isolate_free_page(page, order); > > > + } > > > page += 1 << order; > > > pages_moved += 1 << order; > > > } > > > > Shouldn't NR_FREE_PAGES counter be decreased somewhere above? > > > > [ I can see that it is not modified in __free_pages_ok() and > > free_hot_cold_page() because page is still counted as non-free one but > > here situation is different AFAICS. ] > > > > I tested the patch locally here with CONFIG_CMA=y and it causes some > > major problems for CMA (multiple errors from dma_alloc_from_contiguous() > > about memory ranges being busy and allocation failures). > > > > [ I'm sorry that I don't know more details yet but the issue should be > > easily reproducible. ] > > We spent some more time on the issue and it seems that the approach > taken in the patch (removal of MIGRATE_ISOLATE free_list) is currently > incompatible with CMA. > > In alloc_contig_range() we have: > > order = 0; > outer_start = start; > while (!PageBuddy(pfn_to_page(outer_start))) { > if (++order >= MAX_ORDER) { > ret = -EBUSY; > goto done; > } > outer_start &= ~0UL << order; > } > > for handling cases when the CMA area begins inside the higher order > page from buddy (that got already isolated). Unfortunately this code > no longer works as isolated pages are no longer hold in buddy allocator > (isolate_free_page() clears buddy bit). > > The other part of code that is probably affected by your patch is: > > /* Grab isolated pages from freelists. */ > outer_end = isolate_freepages_range(outer_start, end); > if (!outer_end) { > ret = -EBUSY; > goto done; > } > > also in alloc_contig_range(). isolate_freepages_range() calls > isolate_freepages_block() which assume that free pages (in isolated > pageblock) are in buddy allocator: > > if (!PageBuddy(page)) { > if (strict) > return 0; > continue; > } > > (which is no longer true) and also calls split_free_page() that > attempts to remove page from the free_list & buddy: > > /* Remove page from free list */ > list_del(&page->lru); > zone->free_area[order].nr_free--; > rmv_page_order(page); > > (the isolated page is on the isolated_pages list instead). Thanks for detailed pointing out, Bart! I will revisit this issues after I will fisnish other jobs. -- Kind regards, Minchan Kim -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>