On Tue, 2013-08-27 at 17:37 +0800, Tang Chen wrote:
> After memblock is ready, before SRAT is parsed, we should allocate memory
> near the kernel image. So this patch does the following:
>
> 1. After memblock is ready, make memblock allocate memory from low address
>    to high, and set the lowest limit to the end of kernel image.
> 2. After SRAT is parsed, make memblock behave as default, allocate memory
>    from high address to low, and reset the lowest limit to 0.
>
> This behavior is controlled by movablenode boot option.
>
> Signed-off-by: Tang Chen <tangchen@xxxxxxxxxxxxxx>
> Reviewed-by: Zhang Yanfei <zhangyanfei@xxxxxxxxxxxxxx>
> ---
>  arch/x86/kernel/setup.c |   37 +++++++++++++++++++++++++++++++++++++
>  1 files changed, 37 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index fa7b5f0..0b35bbd 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -1087,6 +1087,31 @@ void __init setup_arch(char **cmdline_p)
>  	trim_platform_memory_ranges();
>  	trim_low_memory_range();
>
> +#ifdef CONFIG_MOVABLE_NODE
> +	if (movablenode_enable_srat) {
> +		/*
> +		 * Memory used by the kernel cannot be hot-removed because Linux cannot
> +		 * migrate the kernel pages. When memory hotplug is enabled, we should
> +		 * prevent memblock from allocating memory for the kernel.
> +		 *
> +		 * ACPI SRAT records all hotpluggable memory ranges. But before SRAT is
> +		 * parsed, we don't know about it.
> +		 *
> +		 * The kernel image is loaded into memory at very early time. We cannot
> +		 * prevent this anyway. So on NUMA system, we set any node the kernel
> +		 * resides in as un-hotpluggable.
> +		 *
> +		 * Since on modern servers, one node could have double-digit gigabytes
> +		 * memory, we can assume the memory around the kernel image is also

Memory hotplug can also be supported in virtualized environments, and we
should allow using SRAT there as a next step. In such environments,
memory hotplug will be performed on a per-memory-device-object basis for
workload balancing, and a node with double-digit gigabytes of memory is
unlikely to be the case for now. So I'd suggest the comment instead
state that all allocations are kept small until SRAT is parsed.

> +		 * un-hotpluggable. So before SRAT is parsed, just allocate memory near
> +		 * the kernel image to try the best to keep the kernel away from
> +		 * hotpluggable memory.
> +		 */
> +		memblock_set_current_order(MEMBLOCK_ORDER_LOW_TO_HIGH);
> +		memblock_set_current_limit_low(__pa_symbol(_end));
> +	}
> +#endif /* CONFIG_MOVABLE_NODE */

Should the above block be moved into init_mem_mapping(), since it is
memblock initialization? It would still be good to keep a concise
comment here, though.

> +
>  	init_mem_mapping();
>
>  	early_trap_pf_init();
> @@ -1127,6 +1152,18 @@ void __init setup_arch(char **cmdline_p)
>  	early_acpi_boot_init();
>
>  	initmem_init();
> +
> +#ifdef CONFIG_MOVABLE_NODE
> +	if (movablenode_enable_srat) {
> +		/*
> +		 * When ACPI SRAT is parsed, which is done in initmem_init(), set
> +		 * memblock back to the default behavior.
> +		 */
> +		memblock_set_current_order(MEMBLOCK_ORDER_DEFAULT);
> +		memblock_set_current_limit_low(0);
> +	}
> +#endif /* CONFIG_MOVABLE_NODE */

Similarly, should this block be moved into initmem_init(), with a short
comment left here?

Thanks,
-Toshi