From: Changsheng Liu <liuchangcheng@xxxxxxxxxx>

When CONFIG_MOVABLE_NODE is enabled and memory is hot-added,
should_add_memory_movable() returns 0 because all zones, including the
movable zone, are empty. The hot-added memory is therefore added to the
normal zone, and the normal zone is created first.

But we want the whole node to be added to the movable zone by default.

So change should_add_memory_movable() to return 1 when all of the
following hold:

  - CONFIG_MOVABLE_NODE is enabled,
  - the sysctl parameter hotadd_memory_as_movable is set, and
  - ZONE_NORMAL is empty, or the start pfn of the hot-added memory is
    at or after the end of ZONE_NORMAL.

Because all zones are empty when a new node is hot-added, the movable
zone is then created first and the whole node is added to the movable
zone by default.

If we want the node to be added to the normal zone instead, we can do it
as follows:

  echo online_kernel > /sys/devices/system/memory/memoryXXX/state

Signed-off-by: Changsheng Liu <liuchangsheng@xxxxxxxxxx>
Signed-off-by: Xiaofeng Yan <yanxiaofeng@xxxxxxxxxx>
Tested-by: Dongdong Fan <fandd@xxxxxxxxxx>
Cc: Wang Nan <wangnan0@xxxxxxxxxx>
Cc: Zhang Yanfei <zhangyanfei@xxxxxxxxxxxxxx>
Cc: Dave Hansen <dave.hansen@xxxxxxxxx>
Cc: Yinghai Lu <yinghai@xxxxxxxxxx>
Cc: Tang Chen <tangchen@xxxxxxxxxxxxxx>
Cc: Hu Tao <hutao@xxxxxxxxxxxxxx>
Cc: Lai Jiangshan <laijs@xxxxxxxxxxxxxx>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@xxxxxxxxxxxxxx>
Cc: Gu Zheng <guz.fnst@xxxxxxxxxxxxxx>
Cc: Toshi Kani <toshi.kani@xxxxxx>
Cc: Xishi Qiu <qiuxishi@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---
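Not part of the patch: to illustrate the order of checks described in the
changelog, here is a minimal userspace sketch in C. struct zone_model and
every name prefixed with model_ or suffixed with _model are hypothetical
stand-ins for the kernel's struct zone and its helpers. The final start-pfn
comparison is an assumption based on the existing comment above
should_add_memory_movable() ("If its address is higher than movable zone,
it should be added as movable"); that code is not shown in this diff.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the kernel's struct zone: the zone spans
 * pfns [zone_start_pfn, zone_start_pfn + spanned_pages).
 */
struct zone_model {
	unsigned long zone_start_pfn;
	unsigned long spanned_pages;
};

static bool model_zone_is_empty(const struct zone_model *z)
{
	return z->spanned_pages == 0;
}

static unsigned long model_zone_end_pfn(const struct zone_model *z)
{
	return z->zone_start_pfn + z->spanned_pages;
}

/* Returns 1 if the hot-added range should go to the movable zone. */
static int should_add_memory_movable_model(int hotadd_memory_as_movable,
					   const struct zone_model *pre_zone,
					   const struct zone_model *movable_zone,
					   unsigned long start_pfn)
{
	/* Check added by the patch: sysctl set, and no overlap with pre_zone. */
	if (hotadd_memory_as_movable &&
	    (model_zone_is_empty(pre_zone) ||
	     model_zone_end_pfn(pre_zone) <= start_pfn))
		return 1;

	/* Existing check, visible as context in the diff. */
	if (model_zone_is_empty(movable_zone))
		return 0;

	/* Assumed from the existing comment: at or above the movable zone start. */
	return movable_zone->zone_start_pfn <= start_pfn;
}

/* A few cases exercising the check added by the patch. */
static void model_self_check(void)
{
	const struct zone_model empty = { 0, 0 };
	const struct zone_model normal = { 0x1000, 0x1000 }; /* ends at pfn 0x2000 */

	/* sysctl off, all zones empty: falls through to the old behaviour */
	assert(should_add_memory_movable_model(0, &empty, &empty, 0x8000) == 0);
	/* sysctl on, all zones empty: movable zone gets created first */
	assert(should_add_memory_movable_model(1, &empty, &empty, 0x8000) == 1);
	/* sysctl on, but the range starts inside the preceding zone */
	assert(should_add_memory_movable_model(1, &normal, &empty, 0x1800) == 0);
	/* sysctl on, range starts exactly at the end of the preceding zone */
	assert(should_add_memory_movable_model(1, &normal, &empty, 0x2000) == 1);
}
```

Calling model_self_check() from a test program runs the four cases. The
second case is the one the changelog is about: a freshly hot-added node
with no populated zones and the sysctl set to 1.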
Documentation/memory-hotplug.txt | 5 ++++-
kernel/sysctl.c | 15 +++++++++++++++
mm/memory_hotplug.c | 23 +++++++++++++++++++++++
3 files changed, 42 insertions(+), 1 deletions(-)
diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt
index ce2cfcf..7e6b4f4 100644
--- a/Documentation/memory-hotplug.txt
+++ b/Documentation/memory-hotplug.txt
@@ -277,7 +277,7 @@ And if the memory block is in ZONE_MOVABLE, you can change it to ZONE_NORMAL:
After this, memory block XXX's state will be 'online' and the amount of
available memory will be increased.
-Currently, newly added memory is added as ZONE_NORMAL (for powerpc, ZONE_DMA).
+Currently, newly added memory is added as ZONE_NORMAL or ZONE_MOVABLE (for powerpc, ZONE_DMA).
This may be changed in future.
@@ -319,6 +319,9 @@ creates ZONE_MOVABLE as following.
Size of memory not for movable pages (not for offline) is TOTAL - ZZZZ.
Size of memory for movable pages (for offline) is ZZZZ.
+A sysctl parameter for assigning hot-added memory to ZONE_MOVABLE is also
+supported. If the value of "kernel/hotadd_memory_as_movable" is 1, hot-added
+memory will be assigned to ZONE_MOVABLE by default.
Note: Unfortunately, there is no information to show which memory block belongs
to ZONE_MOVABLE. This is TBD.
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 19b62b5..855c48e 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -166,6 +166,10 @@ extern int unaligned_dump_stack;
extern int no_unaligned_warning;
#endif
+#ifdef CONFIG_MOVABLE_NODE
+extern int hotadd_memory_as_movable;
+#endif
+
#ifdef CONFIG_PROC_SYSCTL
#define SYSCTL_WRITES_LEGACY -1
@@ -1139,6 +1143,17 @@ static struct ctl_table kern_table[] = {
		.proc_handler	= timer_migration_handler,
	},
#endif
+/* If the value of "kernel/hotadd_memory_as_movable" is 1, hot-added
+ * memory will be assigned to ZONE_MOVABLE by default. */
+#ifdef CONFIG_MOVABLE_NODE
+	{
+		.procname	= "hotadd_memory_as_movable",
+		.data		= &hotadd_memory_as_movable,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec,
+	},
+#endif
	{ }
};
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 26fbba7..5bcaf74 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -37,6 +37,10 @@
#include "internal.h"
+/* If this global variable is 1,
+ * hot-added memory will be assigned to ZONE_MOVABLE by default. */
+int hotadd_memory_as_movable;
+
/*
 * online_page_callback contains pointer to current page onlining function.
 * Initially it is generic_online_page(). If it is required it could be
@@ -1190,6 +1194,9 @@ static int check_hotplug_memory_range(u64 start, u64 size)
/*
 * If movable zone has already been setup, newly added memory should be check.
 * If its address is higher than movable zone, it should be added as movable.
+ * And if CONFIG_MOVABLE_NODE is enabled, the sysctl parameter
+ * "hotadd_memory_as_movable" is set, and the added memory does not overlap
+ * the zone before ZONE_MOVABLE, the memory is added as movable.
 * Without this check, movable zone may overlap with other zone.
 */
static int should_add_memory_movable(int nid, u64 start, u64 size)
@@ -1197,6 +1204,22 @@ static int should_add_memory_movable(int nid, u64 start, u64 size)
	unsigned long start_pfn = start >> PAGE_SHIFT;
	pg_data_t *pgdat = NODE_DATA(nid);
	struct zone *movable_zone = pgdat->node_zones + ZONE_MOVABLE;
+	struct zone *pre_zone = pgdat->node_zones + (ZONE_MOVABLE - 1);
+	/*
+	 * CONFIG_MOVABLE_NODE allows a node to have only movable memory,
+	 * so hot-added memory should be assigned to ZONE_MOVABLE by
+	 * default. However, zone_for_memory() assigns hot-added memory
+	 * to ZONE_NORMAL (on x86_64) by default, and the kernel does not
+	 * allow ZONE_MOVABLE to be created before ZONE_NORMAL.
+	 *
+	 * So, if the sysctl parameter "hotadd_memory_as_movable" is 1
+	 * and either ZONE_NORMAL is empty or the start pfn of the
+	 * hot-added memory is at or after the end of ZONE_NORMAL,
+	 * the hot-added memory is assigned to ZONE_MOVABLE.
+	 */
+	if (hotadd_memory_as_movable &&
+	    (zone_is_empty(pre_zone) || zone_end_pfn(pre_zone) <= start_pfn))
+		return 1;
	if (zone_is_empty(movable_zone))
		return 0;