Re: [PATCH v2 08/18] x86: get pg_data_t's memory from other node

On Thu, 2013-08-01 at 15:06 +0800, Tang Chen wrote:
> From: Yasuaki Ishimatsu <isimatu.yasuaki@xxxxxxxxxxxxxx>
> 
> If the system can create a movable node, in which all of the node's memory is
> allocated as ZONE_MOVABLE, setup_node_data() cannot allocate memory for the
> node's pg_data_t. So use memblock_alloc_try_nid() instead of
> memblock_alloc_nid() to retry on any node when the node-local allocation
> fails. Otherwise, the system could fail to boot.
> 
> The node_data could be on a hotpluggable node, and so could the page tables
> and vmemmap. But for now, doing so would break the memory hot-remove path.
> 
> A node could have several memory devices. And the device who holds node
> data should be hot-removed in the last place. But in NUAM level, we don't

NUAM -> NUMA

> know which memory_block (/sys/devices/system/node/nodeX/memoryXXX) belongs
> to which memory device. We only have node. So we can only do node hotplug.
> 
> But in virtualization, developers are now working on memory hotplug in qemu,
> which supports hotplugging a single memory device. So whole-node hotplug will
> not satisfy virtualization users.
> 
> So in the end, we concluded that we'd better do memory hotplug and the
> node-local things (node-local node data, pagetable, vmemmap, ...) in two steps.
> Please refer to https://lkml.org/lkml/2013/6/19/73
> 
> For now, we put node_data of movable node to another node, and then improve
> it in the future.
> 
> In the later patches, a boot option will be introduced to enable/disable this
> functionality. If users disable it, the node_data will still be put on the
> local node.
> 
> Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@xxxxxxxxxxxxxx>
> Signed-off-by: Lai Jiangshan <laijs@xxxxxxxxxxxxxx>
> Signed-off-by: Tang Chen <tangchen@xxxxxxxxxxxxxx>
> Signed-off-by: Jiang Liu <jiang.liu@xxxxxxxxxx>
> Reviewed-by: Wanpeng Li <liwanp@xxxxxxxxxxxxxxxxxx>
> Reviewed-by: Zhang Yanfei <zhangyanfei@xxxxxxxxxxxxxx>

Acked-by: Toshi Kani <toshi.kani@xxxxxx>

Thanks,
-Toshi


> ---
>  arch/x86/mm/numa.c |    5 ++---
>  1 files changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
> index a71c4e2..5013583 100644
> --- a/arch/x86/mm/numa.c
> +++ b/arch/x86/mm/numa.c
> @@ -209,10 +209,9 @@ static void __init setup_node_data(int nid, u64 start, u64 end)
>  	 * Allocate node data.  Try node-local memory and then any node.
>  	 * Never allocate in DMA zone.
>  	 */
> -	nd_pa = memblock_alloc_nid(nd_size, SMP_CACHE_BYTES, nid);
> +	nd_pa = memblock_alloc_try_nid(nd_size, SMP_CACHE_BYTES, nid);
>  	if (!nd_pa) {
> -		pr_err("Cannot find %zu bytes in node %d\n",
> -		       nd_size, nid);
> +		pr_err("Cannot find %zu bytes in any node\n", nd_size);
>  		return;
>  	}
>  	nd = __va(nd_pa);
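
For readers following the thread, the fallback behaviour the patch relies on (try the requested node first, then any node) can be sketched as a small userspace model. This is only an illustration under stated assumptions: struct node_pool, alloc_nid(), alloc_try_nid() and ANY_NODE are hypothetical names invented here, not the kernel's memblock API, and alignment is ignored.

```c
#include <stdint.h>

#define ANY_NODE (-1)   /* hypothetical marker: no node preference */

/* Hypothetical per-node pool of memory usable for early allocations. */
struct node_pool {
	uint64_t base;   /* fake physical base address (non-zero) */
	uint64_t size;   /* bytes usable by the early allocator */
	uint64_t used;   /* bytes already handed out */
};

/*
 * Allocate 'size' bytes from node 'nid' only, or from the first node
 * with enough room when nid == ANY_NODE. Returns 0 on failure.
 */
static uint64_t alloc_nid(struct node_pool *pools, int nr_nodes,
			  uint64_t size, int nid)
{
	for (int i = 0; i < nr_nodes; i++) {
		if (nid != ANY_NODE && i != nid)
			continue;
		if (pools[i].size - pools[i].used >= size) {
			uint64_t pa = pools[i].base + pools[i].used;

			pools[i].used += size;
			return pa;
		}
	}
	return 0;
}

/*
 * Model of the "try" variant: prefer the requested node, and fall back
 * to any node if the node-local attempt fails.
 */
static uint64_t alloc_try_nid(struct node_pool *pools, int nr_nodes,
			      uint64_t size, int nid)
{
	uint64_t pa = alloc_nid(pools, nr_nodes, size, nid);

	if (pa)
		return pa;
	return alloc_nid(pools, nr_nodes, size, ANY_NODE);
}
```

In this model, a movable node is a pool with size 0: alloc_nid() on that node returns 0, while alloc_try_nid() returns an address from another node, which is the situation the changelog describes for pg_data_t.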

