On 2016/3/23 17:40, Vlastimil Babka wrote:
> Hanjun Guo has reported that a CMA stress test causes broken accounting of
> CMA and free pages:
>
>> Before the test, I got:
>> -bash-4.3# cat /proc/meminfo | grep Cma
>> CmaTotal:         204800 kB
>> CmaFree:          195044 kB
>>
>>
>> After running the test:
>> -bash-4.3# cat /proc/meminfo | grep Cma
>> CmaTotal:         204800 kB
>> CmaFree:         6602584 kB
>>
>> So the freed CMA memory is more than total..
>>
>> Also the MemFree is more than mem total:
>>
>> -bash-4.3# cat /proc/meminfo
>> MemTotal:       16342016 kB
>> MemFree:        22367268 kB
>> MemAvailable:   22370528 kB
>
> Laura Abbott has confirmed the issue and suspected the freepage accounting
> rewrite around 3.18/4.0 by Joonsoo Kim. Joonsoo had a theory that this is
> caused by unexpected merging between MIGRATE_ISOLATE and MIGRATE_CMA
> pageblocks:
>
>> CMA isolates MAX_ORDER aligned blocks, but, during the process,
>> partially isolated block exists. If MAX_ORDER is 11 and
>> pageblock_order is 9, two pageblocks make up MAX_ORDER
>> aligned block and I can think following scenario because pageblock
>> (un)isolation would be done one by one.
>>
>> (each character means one pageblock. 'C', 'I' means MIGRATE_CMA,
>> MIGRATE_ISOLATE, respectively.)
>>
>> CC -> IC -> II (Isolation)
>> II -> CI -> CC (Un-isolation)
>>
>> If some pages are freed at this intermediate state such as IC or CI,
>> that page could be merged to the other page that is resident on
>> different type of pageblock and it will cause wrong freepage count.
>
> This was supposed to be prevented by CMA operating on MAX_ORDER blocks, but
> since it doesn't hold the zone->lock between pageblocks, a race window does
> exist.
>
> It's also likely that unexpected merging can occur between MIGRATE_ISOLATE
> and non-CMA pageblocks. This should be prevented in __free_one_page() since
> commit 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated
> pageblock"). However, we only check the migratetype of the pageblock where
> buddy merging has been initiated, not the migratetype of the buddy pageblock
> (or group of pageblocks) which can be MIGRATE_ISOLATE.
>
> Joonsoo has suggested checking for buddy migratetype as part of
> page_is_buddy(), but that would add extra checks in allocator hotpath and
> bloat-o-meter has shown significant code bloat (the function is inline).
>
> This patch reduces the bloat at some expense of more complicated code. The
> buddy-merging while-loop in __free_one_page() is initially bounded to
> pageblock_border and without any migratetype checks. The checks are placed
> outside, bumping the max_order if merging is allowed, and returning to the
> while-loop with a statement which can't be possibly considered harmful.
>
> This fixes the accounting bug and also removes the arguably weird state in the
> original commit 3c605096d315 where buddies could be left unmerged.
>
> Fixes: 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated pageblock")
> Link: https://lkml.org/lkml/2016/3/2/280
> Reported-by: Hanjun Guo <guohanjun@xxxxxxxxxx>

With the same stress test case (alloc/free cma) running for more than one
hour, the bug I reported is gone.

Tested-by: Hanjun Guo <guohanjun@xxxxxxxxxx>

Thanks for debugging!

Hanjun
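
For anyone following the thread, the control flow described in the quoted
changelog (merge freely inside a pageblock, then check the migratetypes
once before crossing the pageblock boundary) can be sketched as a small
user-space toy. This is only an illustration under stated assumptions and
not the kernel code: toy_free(), toy_reset(), mt_of(), the free_order[]
and block_mt[] arrays and the tiny constants are all hypothetical names
and values made up for the example. It models a single MAX_ORDER-aligned
block made of two pageblocks.

```c
#include <assert.h>

/* Hypothetical toy constants: 4 pages per pageblock, 8 pages per max block. */
#define PAGEBLOCK_ORDER 2u
#define TOY_MAX_ORDER   4u                      /* largest block order is 3 */
#define NPAGES          (1u << (TOY_MAX_ORDER - 1))
#define NBLOCKS         (NPAGES >> PAGEBLOCK_ORDER)

enum toy_mt { MT_CMA, MT_ISOLATE };

static enum toy_mt block_mt[NBLOCKS];  /* migratetype of each pageblock */
static int free_order[NPAGES];         /* order of free block starting here, or -1 */

static void toy_reset(void)
{
    for (unsigned i = 0; i < NPAGES; i++)
        free_order[i] = -1;
    for (unsigned i = 0; i < NBLOCKS; i++)
        block_mt[i] = MT_CMA;
}

static enum toy_mt mt_of(unsigned pfn)
{
    return block_mt[pfn >> PAGEBLOCK_ORDER];
}

/* Free a block and merge it with free buddies; returns the final order. */
static unsigned toy_free(unsigned pfn, unsigned order)
{
    /* First pass is bounded so that merging stays inside one pageblock. */
    unsigned max_order = PAGEBLOCK_ORDER + 1 < TOY_MAX_ORDER ?
                         PAGEBLOCK_ORDER + 1 : TOY_MAX_ORDER;
    unsigned buddy;

    assert(pfn < NPAGES && order < TOY_MAX_ORDER);

continue_merging:
    while (order < max_order - 1) {
        buddy = pfn ^ (1u << order);
        if (free_order[buddy] != (int)order)
            goto done_merging;          /* buddy not free at this order */
        free_order[buddy] = -1;         /* take buddy off the "free list" */
        pfn &= buddy;                   /* start of the combined block */
        order++;
    }
    if (max_order < TOY_MAX_ORDER && order < TOY_MAX_ORDER - 1) {
        /* The next buddy lives in another pageblock: check both types. */
        buddy = pfn ^ (1u << order);
        if (mt_of(pfn) != mt_of(buddy) &&
            (mt_of(pfn) == MT_ISOLATE || mt_of(buddy) == MT_ISOLATE))
            goto done_merging;          /* never merge across an isolated block */
        max_order++;
        goto continue_merging;
    }
done_merging:
    free_order[pfn] = (int)order;
    return order;
}
```

In this toy, freeing both pageblock-sized halves in the "CC" state merges
them into one order-3 block, while in the intermediate "IC" state the two
halves stay separate order-2 blocks, which is the behaviour the changelog
is after.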