On Wed, Mar 23 2016, Vlastimil Babka wrote: > Hanjun Guo has reported that a CMA stress test causes broken accounting of > CMA and free pages: > >> Before the test, I got: >> -bash-4.3# cat /proc/meminfo | grep Cma >> CmaTotal: 204800 kB >> CmaFree: 195044 kB >> >> >> After running the test: >> -bash-4.3# cat /proc/meminfo | grep Cma >> CmaTotal: 204800 kB >> CmaFree: 6602584 kB >> >> So the freed CMA memory is more than total.. >> >> Also the the MemFree is more than mem total: >> >> -bash-4.3# cat /proc/meminfo >> MemTotal: 16342016 kB >> MemFree: 22367268 kB >> MemAvailable: 22370528 kB > > Laura Abbott has confirmed the issue and suspected the freepage accounting > rewrite around 3.18/4.0 by Joonsoo Kim. Joonsoo had a theory that this is > caused by unexpected merging between MIGRATE_ISOLATE and MIGRATE_CMA > pageblocks: > >> CMA isolates MAX_ORDER aligned blocks, but, during the process, >> partialy isolated block exists. If MAX_ORDER is 11 and >> pageblock_order is 9, two pageblocks make up MAX_ORDER >> aligned block and I can think following scenario because pageblock >> (un)isolation would be done one by one. >> >> (each character means one pageblock. 'C', 'I' means MIGRATE_CMA, >> MIGRATE_ISOLATE, respectively. >> >> CC -> IC -> II (Isolation) >> II -> CI -> CC (Un-isolation) >> >> If some pages are freed at this intermediate state such as IC or CI, >> that page could be merged to the other page that is resident on >> different type of pageblock and it will cause wrong freepage count. > > This was supposed to be prevented by CMA operating on MAX_ORDER blocks, but > since it doesn't hold the zone->lock between pageblocks, a race window does > exist. > > It's also likely that unexpected merging can occur between MIGRATE_ISOLATE > and non-CMA pageblocks. This should be prevented in __free_one_page() since > commit 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated > pageblock"). However, we only check the migratetype of the pageblock where > buddy merging has been initiated, not the migratetype of the buddy pageblock > (or group of pageblocks) which can be MIGRATE_ISOLATE. > > Joonsoo has suggested checking for buddy migratetype as part of > page_is_buddy(), but that would add extra checks in allocator hotpath and > bloat-o-meter has shown significant code bloat (the function is inline). > > This patch reduces the bloat at some expense of more complicated code. The > buddy-merging while-loop in __free_one_page() is initially bounded to > pageblock_border and without any migratetype checks. The checks are placed > outside, bumping the max_order if merging is allowed, and returning to the > while-loop with a statement which can't be possibly considered harmful. > > This fixes the accounting bug and also removes the arguably weird state in the > original commit 3c605096d315 where buddies could be left unmerged. > > Fixes: 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated pageblock") > Link: https://lkml.org/lkml/2016/3/2/280 > Reported-by: Hanjun Guo <guohanjun@xxxxxxxxxx> > Debugged-by: Laura Abbott <labbott@xxxxxxxxxx> > Debugged-by: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx> > Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> > Cc: "Kirill A. Shutemov" <kirill@xxxxxxxxxxxxx> > Cc: Johannes Weiner <hannes@xxxxxxxxxxx> > Cc: Minchan Kim <minchan@xxxxxxxxxx> > Cc: Yasuaki Ishimatsu <isimatu.yasuaki@xxxxxxxxxxxxxx> > Cc: Zhang Yanfei <zhangyanfei@xxxxxxxxxxxxxx> > Cc: Michal Nazarewicz <mina86@xxxxxxxxxx> Acked-by: Michal Nazarewicz <mina86@xxxxxxxxxx> > Cc: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx> > Cc: "Aneesh Kumar K.V" <aneesh.kumar@xxxxxxxxxxxxxxxxxx> > Cc: <stable@xxxxxxxxxxxxxxx> # v3.18+ > Signed-off-by: Vlastimil Babka <vbabka@xxxxxxx> I wonder if with this change, ret = start_isolate_page_range(pfn_max_align_down(start), pfn_max_align_up(end), migratetype, false); in alloc_contig_range could be loosen up to align to pageblocks instead of having to use max(pageblock, max_page). It feels that it should be possible, but on the other hand, I’m not certain how buddy allocator will behave if a max_order page spans MIGRATE_CMA and MIGRATE_ISOLATE pageblocks. I guess start_isolate_page_range would have to split such pages? > --- > mm/page_alloc.c | 46 +++++++++++++++++++++++++++++++++------------- > 1 file changed, 33 insertions(+), 13 deletions(-) > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index c46b75d14b6f..b9785af4fae2 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -683,34 +683,28 @@ static inline void __free_one_page(struct page *page, > unsigned long combined_idx; > unsigned long uninitialized_var(buddy_idx); > struct page *buddy; > - unsigned int max_order = MAX_ORDER; > + unsigned int max_order; > + > + max_order = min_t(unsigned int, MAX_ORDER, pageblock_order + 1); > > VM_BUG_ON(!zone_is_initialized(zone)); > VM_BUG_ON_PAGE(page->flags & PAGE_FLAGS_CHECK_AT_PREP, page); > > VM_BUG_ON(migratetype == -1); > - if (is_migrate_isolate(migratetype)) { > - /* > - * We restrict max order of merging to prevent merge > - * between freepages on isolate pageblock and normal > - * pageblock. Without this, pageblock isolation > - * could cause incorrect freepage accounting. > - */ > - max_order = min_t(unsigned int, MAX_ORDER, pageblock_order + 1); > - } else { > + if (likely(!is_migrate_isolate(migratetype))) > __mod_zone_freepage_state(zone, 1 << order, migratetype); > - } > > - page_idx = pfn & ((1 << max_order) - 1); > + page_idx = pfn & ((1 << MAX_ORDER) - 1); > > VM_BUG_ON_PAGE(page_idx & ((1 << order) - 1), page); > VM_BUG_ON_PAGE(bad_range(zone, page), page); > > +continue_merging: > while (order < max_order - 1) { > buddy_idx = __find_buddy_index(page_idx, order); > buddy = page + (buddy_idx - page_idx); > if (!page_is_buddy(page, buddy, order)) > - break; > + goto done_merging; > /* > * Our buddy is free or it is CONFIG_DEBUG_PAGEALLOC guard page, > * merge with it and move up one order. > @@ -727,6 +721,32 @@ static inline void __free_one_page(struct page *page, > page_idx = combined_idx; > order++; > } > + if (max_order < MAX_ORDER) { > + /* If we are here, it means order is >= pageblock_order. > + * We want to prevent merge between freepages on isolate > + * pageblock and normal pageblock. Without this, pageblock > + * isolation could cause incorrect freepage or CMA accounting. > + * > + * We don't want to hit this code for the more frequent > + * low-order merging. > + */ > + if (unlikely(has_isolate_pageblock(zone))) { > + int buddy_mt; > + > + buddy_idx = __find_buddy_index(page_idx, order); > + buddy = page + (buddy_idx - page_idx); > + buddy_mt = get_pageblock_migratetype(buddy); > + > + if (migratetype != buddy_mt > + && (is_migrate_isolate(migratetype) || > + is_migrate_isolate(buddy_mt))) > + goto done_merging; > + } > + max_order++; > + goto continue_merging; > + } > + > +done_merging: > set_page_order(page, order); > > /* > -- > 2.7.3 > -- Best regards ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴイツ «If at first you don’t succeed, give up skydiving» -- To unsubscribe from this list: send the line "unsubscribe stable" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html