From: Tang Chen <tangchen@xxxxxxxxxxxxxx>

The Linux kernel cannot migrate pages used by the kernel itself. As a
result, hotpluggable memory used by the kernel cannot be hot-removed.
To solve this problem, the basic idea is to prevent memblock from
allocating hotpluggable memory for the kernel early in boot, and to
arrange all hotpluggable memory described in the ACPI SRAT (System
Resource Affinity Table) as ZONE_MOVABLE when initializing zones.

In the previous patches, we marked hotpluggable memory regions with the
MEMBLOCK_HOTPLUG flag in memblock.memory.

In this patch, we make memblock skip these hotpluggable memory regions
in the default allocation function.

memblock_find_in_range_node()
  |-->for_each_free_mem_range_reverse()
        |-->__next_free_mem_range_rev()

The above is the only place where __next_free_mem_range_rev() is used,
so skip hotpluggable memory regions there when iterating memblock.memory
to find free memory.

In later patches, a boot option named "movablenode" will be introduced
to enable/disable using SRAT to arrange ZONE_MOVABLE.

NOTE: This check is always performed. That is OK because if the user
      does not specify the movablenode option, no hotpluggable memory
      is marked. The check then skips no memory, and the kernel
      behaves as before.

Signed-off-by: Tang Chen <tangchen@xxxxxxxxxxxxxx>
Reviewed-by: Zhang Yanfei <zhangyanfei@xxxxxxxxxxxxxx>
---
 mm/memblock.c |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/mm/memblock.c b/mm/memblock.c
index d8a9420..9bdebfb 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -819,6 +819,10 @@ void __init_memblock __next_free_mem_range(u64 *idx, int nid,
  * @out_nid: ptr to int for nid of the range, can be %NULL
  *
  * Reverse of __next_free_mem_range().
+ *
+ * Linux kernel cannot migrate pages used by itself. Memory hotplug users won't
+ * be able to hot-remove hotpluggable memory used by the kernel. So this
+ * function skips hotpluggable regions when allocating memory for the kernel.
  */
 void __init_memblock __next_free_mem_range_rev(u64 *idx, int nid,
 					       phys_addr_t *out_start,
@@ -843,6 +847,10 @@ void __init_memblock __next_free_mem_range_rev(u64 *idx, int nid,
 		if (nid != MAX_NUMNODES && nid != memblock_get_region_node(m))
 			continue;
 
+		/* skip hotpluggable memory regions */
+		if (m->flags & MEMBLOCK_HOTPLUG)
+			continue;
+
 		/* scan areas before each reservation for intersection */
 		for ( ; ri >= 0; ri--) {
 			struct memblock_region *r = &rsv->regions[ri];
-- 
1.7.1
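
For readers who want to try the skip logic outside the kernel, here is a
minimal userspace sketch of the reverse walk. It is not the kernel code:
struct sketch_region, SKETCH_HOTPLUG and sketch_next_region_rev() are
hypothetical names invented for this illustration, and the sketch leaves
out the node-ID filter and the intersection with memblock.reserved that
the real __next_free_mem_range_rev() performs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag for this sketch; it plays the role of MEMBLOCK_HOTPLUG. */
#define SKETCH_HOTPLUG 0x1u

/* Simplified stand-in for struct memblock_region (assumption, not kernel API). */
struct sketch_region {
	uint64_t base;
	uint64_t size;
	unsigned long flags;
};

/*
 * Walk the region array from high index to low, starting just below *idx.
 * Return the first region that is not marked hotpluggable, or NULL when
 * none is left. *idx is updated so the next call continues below the
 * region that was returned.
 */
static const struct sketch_region *
sketch_next_region_rev(const struct sketch_region *regions, size_t count,
		       size_t *idx)
{
	assert(regions != NULL && idx != NULL && *idx <= count);

	while (*idx > 0) {
		const struct sketch_region *m = &regions[--(*idx)];

		/* skip hotpluggable memory regions */
		if (m->flags & SKETCH_HOTPLUG)
			continue;

		return m;
	}

	return NULL;
}
```

A caller would start with *idx equal to the number of regions and call
the function until it returns NULL. If no region carries the flag, every
region is returned in reverse order, which matches the NOTE above: with
no marked memory the check skips nothing.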