On Thu, Jun 08, 2017 at 02:23:18PM +0200, Michal Hocko wrote:
movable_node kernel parameter allows to make hotplugable NUMA nodes to put all the hotplugable memory into movable zone which allows more or less reliable memory hotremove. At least this is the case for the NUMA nodes present during the boot (see find_zone_movable_pfns_for_nodes). This is not the case for the memory hotplug, though. echo online > /sys/devices/system/memory/memoryXYZ/status will default to a kernel zone (usually ZONE_NORMAL) unless the particular memblock is already in the movable zone range which is not the case normally when onlining the memory from the udev rule context for a freshly hotadded NUMA node. The only option currently is to have a special udev rule to echo online_movable to all memblocks belonging to such a node which is rather clumsy. Not the mention this is inconsistent as well because what ended up in the movable zone during the boot will end up in a kernel zone after hotremove & hotadd without special care. It would be nice to reuse memblock_is_hotpluggable but the runtime hotplug doesn't have that information available because the boot and hotplug paths are not shared and it would be really non trivial to make them use the same code path because the runtime hotplug doesn't play with the memblock allocator at all. Teach move_pfn_range that MMOP_ONLINE_KEEP can use the movable zone if movable_node is enabled and the range doesn't overlap with the existing normal zone. This should provide a reasonable default onlining strategy. Strictly speaking the semantic is not identical with the boot time initialization because find_zone_movable_pfns_for_nodes covers only the hotplugable range as described by the BIOS/FW. From my experience this is usually a full node though (except for Node0 which is special and never goes away completely). If this turns out to be a problem in the real life we can tweak the code to store hotplug flag into memblocks but let's keep this simple now. Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
Acked-by: Reza Arbab <arbab@xxxxxxxxxxxxxxxxxx> -- Reza Arbab -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>