From: Changsheng Liu <liuchangcheng@xxxxxxxxxx>

When CONFIG_MOVABLE_NODE is enabled and memory is hot-added,
should_add_memory_movable() returns 0 because all zones, including
ZONE_MOVABLE, are empty. The hot-added memory is therefore assigned to
ZONE_NORMAL, and ZONE_NORMAL is created first. But we want the whole
node to be added to ZONE_MOVABLE by default.

So change should_add_memory_movable(): if CONFIG_MOVABLE_NODE is
enabled, the sysctl parameter hotadd_memory_as_movable is 1, and
ZONE_NORMAL is empty or the pfn of the hot-added memory is after the
end of ZONE_NORMAL, it always returns 1 and the whole node is added to
ZONE_MOVABLE by default.

If we want the node to be assigned to ZONE_NORMAL, we can do it as
follows:
"echo online_kernel > /sys/devices/system/memory/memoryXXX/state"

With this patch, the behaviour of the kernel is controlled by a sysctl,
and the user can automatically create movable memory with only the
following udev rule:
SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online"

Signed-off-by: Changsheng Liu <liuchangsheng@xxxxxxxxxx>
Signed-off-by: Xiaofeng Yan <yanxiaofeng@xxxxxxxxxx>
Tested-by: Dongdong Fan <fandd@xxxxxxxxxx>
Cc: Wang Nan <wangnan0@xxxxxxxxxx>
Cc: Dave Hansen <dave.hansen@xxxxxxxxx>
Cc: Yinghai Lu <yinghai@xxxxxxxxxx>
Cc: Tang Chen <tangchen@xxxxxxxxxxxxxx>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@xxxxxxxxxxxxxx>
Cc: Toshi Kani <toshi.kani@xxxxxx>
Cc: Xishi Qiu <qiuxishi@xxxxxxxxxx>
---
 Documentation/memory-hotplug.txt |    5 ++++-
 kernel/sysctl.c                  |   15 +++++++++++++++
 mm/memory_hotplug.c              |   24 ++++++++++++++++++++++++
 3 files changed, 43 insertions(+), 1 deletions(-)

diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt
index ce2cfcf..7ac7485 100644
--- a/Documentation/memory-hotplug.txt
+++ b/Documentation/memory-hotplug.txt
@@ -277,7 +277,7 @@ And if the memory block is in ZONE_MOVABLE, you can change it to ZONE_NORMAL:
 After this, memory block XXX's state will be 'online' and the amount of available
 memory will be increased.
 
-Currently, newly added memory is added as ZONE_NORMAL (for powerpc, ZONE_DMA).
+Currently, newly added memory is added as ZONE_NORMAL or ZONE_MOVABLE (for powerpc, ZONE_DMA).
 This may be changed in future.
 
 
@@ -319,6 +319,9 @@ creates ZONE_MOVABLE as following.
 	Size of memory not for movable pages (not for offline) is TOTAL - ZZZZ.
 	Size of memory for movable pages (for offline) is ZZZZ.
 
+And a sysctl parameter for assigning the hot added memory to ZONE_MOVABLE is
+supported. If the value of "kernel/hotadd_memory_as_movable" is 1,the hot added
+memory will be assigned to ZONE_MOVABLE by default.
 
 Note: Unfortunately, there is no information to show which memory block belongs
 to ZONE_MOVABLE. This is TBD.
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 19b62b5..16b1501 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -166,6 +166,10 @@ extern int unaligned_dump_stack;
 extern int no_unaligned_warning;
 #endif
 
+#ifdef CONFIG_MOVABLE_NODE
+extern int hotadd_memory_as_movable;
+#endif
+
 #ifdef CONFIG_PROC_SYSCTL
 
 #define SYSCTL_WRITES_LEGACY	-1
@@ -1139,6 +1143,17 @@ static struct ctl_table kern_table[] = {
 		.proc_handler	= timer_migration_handler,
 	},
 #endif
+/*If the value of "kernel/hotadd_memory_as_movable" is 1,the hot added
+ * memory will be assigned to ZONE_MOVABLE by default.*/
+#ifdef CONFIG_MOVABLE_NODE
+	{
+		.procname	= "hotadd_memory_as_movable",
+		.data		= &hotadd_memory_as_movable,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec,
+	},
+#endif
 	{ }
 };
 
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 26fbba7..eca5512 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -37,6 +37,11 @@
 
 #include "internal.h"
 
+/*If the global variable value is 1,
+ * the hot added memory will be assigned to ZONE_MOVABLE by default
+ */
+int hotadd_memory_as_movable;
+
 /*
  * online_page_callback contains pointer to current page onlining function.
  * Initially it is generic_online_page(). If it is required it could be
@@ -1190,6 +1195,9 @@ static int check_hotplug_memory_range(u64 start, u64 size)
 /*
  * If movable zone has already been setup, newly added memory should be check.
  * If its address is higher than movable zone, it should be added as movable.
+ * And if system config CONFIG_MOVABLE_NODE and set the sysctl parameter
+ * "hotadd_memory_as_movable" and added memory does not overlap the zone
+ * before MOVABLE_ZONE,the memory will be added as movable.
  * Without this check, movable zone may overlap with other zone.
  */
 static int should_add_memory_movable(int nid, u64 start, u64 size)
@@ -1197,6 +1205,22 @@ static int should_add_memory_movable(int nid, u64 start, u64 size)
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	pg_data_t *pgdat = NODE_DATA(nid);
 	struct zone *movable_zone = pgdat->node_zones + ZONE_MOVABLE;
+	struct zone *pre_zone = pgdat->node_zones + (ZONE_MOVABLE - 1);
+	/*
+	 * The system configs CONFIG_MOVABLE_NODE to assign a node
+	 * which has only movable memory,so the hot-added memory should
+	 * be assigned to ZONE_MOVABLE by default,
+	 * but the function zone_for_memory() assign the hot-added memory
+	 * to ZONE_NORMAL(x86_64) by default.Kernel does not allow to
+	 * create ZONE_MOVABLE before ZONE_NORMAL,So if the value of
+	 * sysctl parameter "hotadd_memory_as_movable" is 1
+	 * and the ZONE_NORMAL is empty or the pfn of the hot-added memory
+	 * is after the end of ZONE_NORMAL
+	 * the hot-added memory will be assigned to ZONE_MOVABLE.
+	 */
+	if (hotadd_memory_as_movable
+	    && (zone_is_empty(pre_zone) || zone_end_pfn(pre_zone) <= start_pfn))
+		return 1;
 
 	if (zone_is_empty(movable_zone))
 		return 0;
-- 
1.7.1
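
For review purposes, the decision that should_add_memory_movable() makes
after this patch can be modelled in user space. The sketch below is not
kernel code: struct zone_model and the *_model helpers are hypothetical
stand-ins for struct zone, zone_is_empty() and zone_end_pfn(). The final
comparison against the start of the movable zone is an assumption taken
from the existing comment ("If its address is higher than movable zone,
it should be added as movable"), since that part of the function is not
shown in the diff.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the pfn bookkeeping kept in struct zone. */
struct zone_model {
	unsigned long start_pfn;	/* first page frame number of the zone */
	unsigned long spanned_pages;	/* 0 means the zone is empty */
};

static bool zone_model_is_empty(const struct zone_model *z)
{
	return z->spanned_pages == 0;
}

/* Exclusive end: the first pfn that lies past the zone. */
static unsigned long zone_model_end_pfn(const struct zone_model *z)
{
	return z->start_pfn + z->spanned_pages;
}

/*
 * Model of the patched decision. pre_zone is the zone just below
 * ZONE_MOVABLE (ZONE_NORMAL on x86_64, per the patch comment).
 */
static bool should_add_movable_model(bool hotadd_memory_as_movable,
				     const struct zone_model *pre_zone,
				     const struct zone_model *movable_zone,
				     unsigned long start_pfn)
{
	/* New check: sysctl set and the range does not fall inside pre_zone. */
	if (hotadd_memory_as_movable &&
	    (zone_model_is_empty(pre_zone) ||
	     zone_model_end_pfn(pre_zone) <= start_pfn))
		return true;

	/* Existing behaviour: with no movable zone yet, add as non-movable. */
	if (zone_model_is_empty(movable_zone))
		return false;

	/* Assumed from the existing comment: at or above the movable zone. */
	return movable_zone->start_pfn <= start_pfn;
}
```

With the sysctl set and every zone on the node empty, the model returns
true, which is the "whole node becomes ZONE_MOVABLE" case described in
the changelog. With the sysctl cleared, the same input returns false,
matching the behaviour before the patch.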