On Fri, Mar 18, 2016 at 03:10:09PM +0100, Vlastimil Babka wrote:
> On 03/17/2016 04:52 PM, Joonsoo Kim wrote:
> > 2016-03-18 0:43 GMT+09:00 Vlastimil Babka <vbabka@xxxxxxx>:
> >>>>>>
> >>>>>> Okay. I used following slightly optimized version and I need to
> >>>>>> add 'max_order = min_t(unsigned int, MAX_ORDER, pageblock_order + 1)'
> >>>>>> to yours. Please consider it, too.
> >>>>>
> >>>>> Hmm, this one is not work, I still can see the bug is there after
> >>>>> applying
> >>>>> this patch, did I miss something?
> >>>>
> >>>> I may find that there is a bug which was introduced by me some time
> >>>> ago. Could you test following change in __free_one_page() on top of
> >>>> Vlastimil's patch?
> >>>>
> >>>> -page_idx = pfn & ((1 << max_order) - 1);
> >>>> +page_idx = pfn & ((1 << MAX_ORDER) - 1);
> >>>
> >>>
> >>> I tested Vlastimil's patch + your change with stress for more than half
> >>> hour, the bug
> >>> I reported is gone :)
> >>
> >>
> >> Oh, ok, will try to send proper patch, once I figure out what to write in
> >> the changelog :)
> >
> > Thanks in advance!
> >
>
> OK, here it is. Hanjun can you please retest this, as I'm not sure if you had
> the same code due to the followup one-liner patches in the thread. Lucas, see if
> it helps with your issue as well.
> Laura and Joonsoo, please also test and review
> and check changelog if my perception of the problem is accurate :)
>
> Thanks
>
> ----8<----
> From: Vlastimil Babka <vbabka@xxxxxxx>
> Date: Fri, 18 Mar 2016 14:22:31 +0100
> Subject: [PATCH] mm/page_alloc: prevent merging between isolated and other
>  pageblocks
>
> Hanjun Guo has reported that a CMA stress test causes broken accounting of
> CMA and free pages:
>
> > Before the test, I got:
> > -bash-4.3# cat /proc/meminfo | grep Cma
> > CmaTotal: 204800 kB
> > CmaFree: 195044 kB
> >
> >
> > After running the test:
> > -bash-4.3# cat /proc/meminfo | grep Cma
> > CmaTotal: 204800 kB
> > CmaFree: 6602584 kB
> >
> > So the freed CMA memory is more than total..
> >
> > Also the the MemFree is more than mem total:
> >
> > -bash-4.3# cat /proc/meminfo
> > MemTotal: 16342016 kB
> > MemFree: 22367268 kB
> > MemAvailable: 22370528 kB
>
> Laura Abbott has confirmed the issue and suspected the freepage accounting
> rewrite around 3.18/4.0 by Joonsoo Kim. Joonsoo had a theory that this is
> caused by unexpected merging between MIGRATE_ISOLATE and MIGRATE_CMA
> pageblocks:
>
> > CMA isolates MAX_ORDER aligned blocks, but, during the process,
> > partialy isolated block exists. If MAX_ORDER is 11 and
> > pageblock_order is 9, two pageblocks make up MAX_ORDER
> > aligned block and I can think following scenario because pageblock
> > (un)isolation would be done one by one.
> >
> > (each character means one pageblock. 'C', 'I' means MIGRATE_CMA,
> > MIGRATE_ISOLATE, respectively.
> >
> > CC -> IC -> II (Isolation)
> > II -> CI -> CC (Un-isolation)
> >
> > If some pages are freed at this intermediate state such as IC or CI,
> > that page could be merged to the other page that is resident on
> > different type of pageblock and it will cause wrong freepage count.
>
> This was supposed to be prevented by CMA operating on MAX_ORDER blocks, but
> since it doesn't hold the zone->lock between pageblocks, a race window does
> exist.
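
For anyone following the "IC" scenario above, here is a toy model of how the
count could drift. It is only a sketch under stated assumptions, not kernel
code: every toy_* name is hypothetical, a pageblock is taken to be 512 pages
(pageblock_order 9), isolated pages are assumed not to be counted as free when
they are freed, and un-isolation is assumed to add the full size of whatever
free chunk it finds on the isolate list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all names are hypothetical, this is not kernel code. */
enum toy_type { TOY_CMA, TOY_ISOLATE };

#define TOY_PAGEBLOCK_PAGES 512L	/* assumes pageblock_order = 9 */

struct toy_counts {
	long counted_free;	/* what the free-page counter claims */
	long actual_free;	/* pages that really are free */
};

/*
 * Free one whole pageblock of type 'freed' while the neighbouring buddy
 * pageblock of type 'buddy' is already free. Returns the size in pages of
 * the free chunk that ends up on the freed pageblock's list.
 */
long toy_free_pageblock(struct toy_counts *c, enum toy_type freed,
			enum toy_type buddy, bool check_buddy_type)
{
	assert(c != NULL);

	c->actual_free += TOY_PAGEBLOCK_PAGES;
	/* Assumption: pages freed into an isolated pageblock are not counted. */
	if (freed != TOY_ISOLATE)
		c->counted_free += TOY_PAGEBLOCK_PAGES;

	/* Exactly one of the two pageblocks is isolated. */
	bool cross_type = (freed == TOY_ISOLATE) != (buddy == TOY_ISOLATE);

	if (check_buddy_type && cross_type)
		return TOY_PAGEBLOCK_PAGES;	/* merge refused */
	return 2 * TOY_PAGEBLOCK_PAGES;		/* merged across pageblocks */
}

/* Assumption: un-isolation counts every page of the chunk it moves. */
void toy_unisolate(struct toy_counts *c, long chunk_pages)
{
	assert(c != NULL);
	c->counted_free += chunk_pages;
}

/* Run the "IC" scenario and return (counted - actual) free pages. */
long toy_run(bool check_buddy_type)
{
	/* The C pageblock is already free and has been counted once. */
	struct toy_counts c = { TOY_PAGEBLOCK_PAGES, TOY_PAGEBLOCK_PAGES };
	long chunk;

	/* The I pageblock is freed while its buddy is still MIGRATE_CMA. */
	chunk = toy_free_pageblock(&c, TOY_ISOLATE, TOY_CMA, check_buddy_type);
	toy_unisolate(&c, chunk);

	return c.counted_free - c.actual_free;
}
```

In this model toy_run(false) leaves the counter 512 pages too high, because
the already-counted C pageblock is counted a second time as part of the merged
chunk, while toy_run(true) returns 0 since the cross-type merge is refused.
The real accounting paths are more involved, so take this as an illustration
of the idea only.
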
>
> It's also likely that unexpected merging can occur between MIGRATE_ISOLATE
> and non-CMA pageblocks. This should be prevented in __free_one_page() since
> commit 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated
> pageblock"). However, we only check the migratetype of the pageblock where
> buddy merging has been initiated, not the migratetype of the buddy pageblock
> (or group of pageblocks) which can be MIGRATE_ISOLATE.
>
> Joonsoo has suggested checking for buddy migratetype as part of
> page_is_buddy(), but that would add extra checks in allocator hotpath and
> bloat-o-meter has shown significant code bloat (the function is inline).
>
> This patch reduces the bloat at some expense of more complicated code. The
> buddy-merging while-loop in __free_one_page() is initially bounded to
> pageblock_border and without any migratetype checks. The checks are placed
> outside, bumping the max_order if merging is allowed, and returning to the
> while-loop with a statement which can't be possibly considered harmful.
>
> This fixes the accounting bug and also removes the arguably weird state in the
> original commit 3c605096d315 where buddies could be left unmerged.
>
> Fixes: 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated pageblock")
> Link: https://lkml.org/lkml/2016/3/2/280
> Reported-by: Hanjun Guo <guohanjun@xxxxxxxxxx>
> Debugged-by: Laura Abbott <labbott@xxxxxxxxxx>
> Debugged-by: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
> Signed-off-by: Vlastimil Babka <vbabka@xxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx> # 3.18+
> ---
>  mm/page_alloc.c | 46 +++++++++++++++++++++++++++++++++-------------
>  1 file changed, 33 insertions(+), 13 deletions(-)

Acked-by: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>

Thanks for taking care of this issue!

Thanks.