On 03/20/2017 07:33 AM, Joonsoo Kim wrote:
>> The fact sticky movable pageblocks aren't ideal for CMA doesn't mean
>> they're not ideal for memory hotunplug though.
>>
>> With CMA there's no point in having the sticky movable pageblocks
>> scattered around and it's purely a misfeature to use sticky movable
>> pageblocks because you need the whole CMA area contiguous hence a
>> ZONE_CMA is ideal.
> No. CMA ranges could be registered many times for each devices and they
> could be scattered due to device's H/W limitation. So, current implementation
> in kernel, MIGRATE_CMA pageblocks, are scattered sometimes.
>
>> As opposed with memory hotplug the sticky movable pageblocks would
>> allow the kernel to satisfy the current /sys API and they would
>> provide no downside unlike in the CMA case where the size of the
>> allocation is unknown.
> No, same downside also exists in this case. Downside is not related to the case
> that device uses that range. It is related to VM management to this range and
> problems are the same. For example, with sticky movable pageblock, we need to
> subtract number of freepages in sticky movable pageblock when watermark is
> checked for non-movable allocation and it causes some problems.

Agreed. Right now for CMA we have to account NR_FREE_CMA_PAGES (the
number of free pages within MIGRATE_CMA pageblocks), which brings all
those hooks and other troubles to keep the accounting precise (there
used to be various races in there). This goes against the rest of the
page grouping by mobility design, which wasn't meant to be precise, for
performance reasons (e.g. when you change a pageblock's type and move
pages between freelists, any pcpu cached pages are left on their
previous type's list).

We also can't ignore this accounting, as the watermark check could then
pass for e.g.
an UNMOVABLE allocation, which would proceed to find that the only free
pages available are within the MIGRATE_CMA (or sticky-movable)
pageblocks, which it's not allowed to fall back to. If we only went
reclaiming at that point, the zone balance checks would also consider
the zone balanced, even though unmovable allocations would still not be
possible.

Even with this extra accounting, things are not perfect, because reclaim
doesn't guarantee freeing the pages in the right pageblocks, so we can
easily overreclaim.

That's mainly why I agreed that ZONE_CMA should be better than the
current implementation, and I'm skeptical about the sticky-movable
pageblock idea. Note that the conversion to node-lru reclaim has changed
things somewhat, as we can't reclaim a single zone anymore, but the
accounting troubles remain.
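
To illustrate the watermark problem described above, here is a minimal
userspace sketch. It is not the kernel's code: struct zone_model and both
helper functions are hypothetical names made up for this example, and the
real check also considers lowmem reserves and allocation order, which are
left out here. It only shows why free pages in MIGRATE_CMA (or
sticky-movable) pageblocks have to be subtracted for allocations that
cannot use them.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of a zone's free-page counters. */
struct zone_model {
    long nr_free_pages;     /* all free pages in the zone */
    long nr_free_cma_pages; /* free pages in MIGRATE_CMA or sticky-movable pageblocks */
};

/* Naive check: ignores that some free pages sit in restricted pageblocks. */
static bool watermark_ok_naive(const struct zone_model *z, long mark)
{
    return z->nr_free_pages > mark;
}

/*
 * Accounted check: an allocation that may not use the restricted
 * pageblocks (e.g. an unmovable one) must not count their free pages.
 */
static bool watermark_ok_accounted(const struct zone_model *z, long mark,
                                   bool can_use_restricted)
{
    long usable = z->nr_free_pages;

    if (!can_use_restricted)
        usable -= z->nr_free_cma_pages;

    return usable > mark;
}
```

With, say, 1000 free pages of which 950 are in restricted pageblocks and
a watermark of 100, the naive check passes for an unmovable allocation
that in fact has only 50 usable pages, while the accounted check fails
it. The price is that nr_free_cma_pages has to be kept precise, which is
where the hooks and the past races come from.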