The patch titled
     ia64 node hotplug: fix up cpu-to-node relationship
has been removed from the -mm tree.  Its filename is
     node-hotplug-fixup-cpu-to-node.patch

This patch was dropped because it is obsolete

------------------------------------------------------
Subject: ia64 node hotplug: fix up cpu-to-node relationship
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>

I tested node hotplug on a NUMA machine equipped with physical node hotplug
and hit a panic (see below).  This is because node_to_cpu_mask is not
updated to a valid value.  Patch is attached.

==
Sep 21 14:15:26 drpq kernel:  [<a00000010000c700>] __ia64_leave_kernel+0x0/0x280
Sep 21 14:15:26 drpq kernel:                                 sp=e00001401d426d80 bsp=e00001401d421618
Sep 21 14:15:26 drpq kernel:  [<a000000100062040>] sd_degenerate+0x80/0xe0
Sep 21 14:15:26 drpq kernel:                                 sp=e00001401d426f50 bsp=e00001401d4215e0
Sep 21 14:15:26 drpq kernel:  [<a000000100065710>] cpu_attach_domain+0x90/0x1e0
Sep 21 14:15:26 drpq kernel:                                 sp=e00001401d426f50 bsp=e00001401d421598
Sep 21 14:15:26 drpq kernel:  [<a00000010006da20>] build_sched_domains+0x14e0/0x2360
Sep 21 14:15:26 drpq kernel:                                 sp=e00001401d426f50 bsp=e00001401d4214c0
Sep 21 14:15:26 drpq kernel:  [<a00000010006e8f0>] arch_init_sched_domains+0x50/0x80
Sep 21 14:15:26 drpq kernel:                                 sp=e00001401d427da0 bsp=e00001401d4214a0
Sep 21 14:15:26 drpq kernel:  [<a00000010006e9c0>] update_sched_domains+0xa0/0xe0
Sep 21 14:15:26 drpq kernel:                                 sp=e00001401d427e20 bsp=e00001401d421478
Sep 21 14:15:26 drpq kernel:  [<a0000001006201b0>] notifier_call_chain+0x50/0xc0
Sep 21 14:15:26 drpq kernel:                                 sp=e00001401d427e20 bsp=e00001401d421440
Sep 21 14:15:26 drpq kernel:  [<a00000010009ede0>] blocking_notifier_call_chain+0x40/0x80
Sep 21 14:15:26 drpq kernel:                                 sp=e00001401d427e20 bsp=e00001401d421408
Sep 21 14:15:26 drpq kernel:  [<a0000001000bccb0>] cpu_up+0x290/0x300
Sep 21 14:15:26 drpq kernel:                                 sp=e00001401d427e20 bsp=e00001401d4213c8
Sep 21 14:15:27 drpq kernel:  [<a0000001003d2b90>] store_online+0x70/0xe0
Sep 21 14:15:27 drpq kernel:                                 sp=e00001401d427e20 bsp=e00001401d421398
Sep 21 14:15:27 drpq kernel:  [<a0000001003c9e20>] sysdev_store+0x60/0xa0
Sep 21 14:15:27 drpq kernel:                                 sp=e00001401d427e20 bsp=e00001401d421360
Sep 21 14:15:27 drpq kernel:  [<a0000001001edfa0>] sysfs_write_file+0x240/0x2e0
Sep 21 14:15:27 drpq kernel:                                 sp=e00001401d427e20 bsp=e00001401d421310
Sep 21 14:15:27 drpq kernel:  [<a000000100156ae0>] vfs_write+0x200/0x3a0
Sep 21 14:15:27 drpq kernel:                                 sp=e00001401d427e20 bsp=e00001401d4212c0
Sep 21 14:15:27 drpq kernel:  [<a000000100157630>] sys_write+0x70/0xe0
Sep 21 14:15:27 drpq kernel:                                 sp=e00001401d427e20 bsp=e00001401d421248
==

If a cpu is not tied to its node before it is onlined, the system panics:
node_to_cpu_mask[] must hold a valid value before the CPU_ONLINE notifier
is called.

This can happen when a cpu is physically offlined at boot time (ia64).
(*) See arch/ia64/kernel/numa.c.

To bind a cpu to a node, we need the physical_id of the cpu.  But
smp_build_cpu_map() in smpboot.c does not set the physical_id of
*not present* cpus.

This patch updates node_to_cpumask() and cpu_to_node_map[] just after
onlining and offlining a cpu.

It also fixes the pxm_to_node usage: acpi_map_pxm_to_node() should be
called here.

Tested on a NUMA machine which supports hardware node hot-add.

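The bookkeeping described above can be illustrated with a small userspace model. This is only a sketch, not kernel code: the `model_*` names, the 64-bit bitmask used in place of `cpumask_t`, and the hard-coded node-online table are all assumptions made for illustration. It mirrors the idea of binding a cpu to its node when it comes online (falling back to node 0 if that node is offline) and unbinding it when it goes offline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_CPUS  8
#define MODEL_MAX_NODES 4

/* Hypothetical stand-ins for the per-node cpu mask and the cpu-to-node map. */
static uint64_t model_node_to_cpumask[MODEL_MAX_NODES];
static int model_cpu_to_node[MODEL_MAX_CPUS];

/* Assumed example topology: only nodes 0 and 2 are online. */
static bool model_node_online[MODEL_MAX_NODES] = { true, false, true, false };

/*
 * Called when a cpu comes online. firmware_nid is the node the platform
 * reports for this cpu. Returns the node the cpu was actually bound to.
 */
int model_cpu_online(int cpu, int firmware_nid)
{
	int nid = firmware_nid;

	/* An offline node cannot own cpus in this model: fall back to node 0. */
	if (!model_node_online[nid])
		nid = 0;

	/* Update both directions of the mapping only if not already bound. */
	if (!(model_node_to_cpumask[nid] & (UINT64_C(1) << cpu))) {
		model_node_to_cpumask[nid] |= UINT64_C(1) << cpu;
		model_cpu_to_node[cpu] = nid;
	}
	return nid;
}

/* Called when a cpu goes offline: drop it from its node's mask. */
void model_cpu_offline(int cpu)
{
	int nid = model_cpu_to_node[cpu];

	model_node_to_cpumask[nid] &= ~(UINT64_C(1) << cpu);
	model_cpu_to_node[cpu] = 0;
}
```

In this model, any consumer that walks `model_node_to_cpumask[]` after `model_cpu_online()` returns sees the cpu in exactly one node's mask, which is the invariant the CPU_ONLINE notifier chain relies on according to the description above.
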
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
Cc: "Luck, Tony" <tony.luck@xxxxxxxxx>
Cc: <stable@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxx>
---

 arch/ia64/kernel/acpi.c    |    5 +++--
 arch/ia64/kernel/smpboot.c |   18 ++++++++++++++++++
 2 files changed, 21 insertions(+), 2 deletions(-)

diff -puN arch/ia64/kernel/acpi.c~node-hotplug-fixup-cpu-to-node arch/ia64/kernel/acpi.c
--- a/arch/ia64/kernel/acpi.c~node-hotplug-fixup-cpu-to-node
+++ a/arch/ia64/kernel/acpi.c
@@ -771,14 +771,15 @@ int acpi_map_cpu2node(acpi_handle handle
 {
 #ifdef CONFIG_ACPI_NUMA
 	int pxm_id;
+	int nid;
 
 	pxm_id = acpi_get_pxm(handle);
-
+	nid = acpi_map_pxm_to_node(pxm_id);
 	/*
 	 * Assuming that the container driver would have set the proximity
 	 * domain and would have initialized pxm_to_node(pxm_id) && pxm_flag
 	 */
-	node_cpuid[cpu].nid = (pxm_id < 0) ? 0 : pxm_to_node(pxm_id);
+	node_cpuid[cpu].nid = nid;
 
 	node_cpuid[cpu].phys_id = physid;
 #endif
diff -puN arch/ia64/kernel/smpboot.c~node-hotplug-fixup-cpu-to-node arch/ia64/kernel/smpboot.c
--- a/arch/ia64/kernel/smpboot.c~node-hotplug-fixup-cpu-to-node
+++ a/arch/ia64/kernel/smpboot.c
@@ -59,6 +59,7 @@
 #include <asm/system.h>
 #include <asm/tlbflush.h>
 #include <asm/unistd.h>
+#include <asm/topology.h>
 
 #define SMP_DEBUG 0
 
@@ -377,6 +378,7 @@ smp_callin (void)
 	int cpuid, phys_id, itc_master;
 	extern void ia64_init_itm(void);
 	extern volatile int time_keeper_id;
+	int nid;
 
 #ifdef CONFIG_PERFMON
 	extern void pfm_init_percpu(void);
@@ -396,6 +398,17 @@ smp_callin (void)
 
 	lock_ipi_calllock();
 	cpu_set(cpuid, cpu_online_map);
+#ifdef CONFIG_NUMA
+	nid = node_cpuid[cpuid].nid;
+	/* this can be removed when node-hotplug by cpu-hot-add is implemented */
+	if (!node_online(nid))
+		nid = 0;
+	/* see also build_cpu_to_node_map() in numa.c */
+	if (!cpu_isset(cpuid, node_to_cpumask(nid))) {
+		cpu_set(cpuid, node_to_cpumask(nid));
+		cpu_to_node_map[cpuid] = nid;
+	}
+#endif
 	unlock_ipi_calllock();
 	per_cpu(cpu_state, cpuid) = CPU_ONLINE;
 
@@ -701,6 +714,7 @@ int migrate_platform_irqs(unsigned int c
 int __cpu_disable(void)
 {
 	int cpu = smp_processor_id();
+	int nid = cpu_to_node(cpu);
 
 	/*
 	 * dont permit boot processor for now
@@ -719,6 +733,10 @@ int __cpu_disable(void)
 
 	remove_siblinginfo(cpu);
 	cpu_clear(cpu, cpu_online_map);
+#ifdef CONFIG_NUMA
+	cpu_clear(cpu, node_to_cpumask(nid));
+	cpu_to_node_map[cpu] = 0;
+#endif
 	fixup_irqs();
 	local_flush_tlb_all();
 	cpu_clear(cpu, cpu_callin_map);
_

Patches currently in -mm which might be from kamezawa.hiroyu@xxxxxxxxxxxxxx are

node-hotplug-fixup-cpu-to-node.patch
cpu-to-node-relationship-fixup-take2.patch
hot-add-mem-x86_64-fixup-externs.patch
hot-add-mem-x86_64-kconfig-changes.patch
hot-add-mem-x86_64-enable-sparsemem-in-sratc.patch
hot-add-mem-x86_64-memory_add_physaddr_to_nid-enable.patch
hot-add-mem-x86_64-memory_add_physaddr_to_nid-node-fixup.patch
hot-add-mem-x86_64-memory_add_physaddr_to_nid-node-fixup-fix.patch
hot-add-mem-x86_64-use-config_memory_hotplug_sparse.patch
hot-add-mem-x86_64-use-config_memory_hotplug_reserve.patch
hot-add-mem-x86_64-use-config_memory_hotplug_reserve-fix.patch
introduce-mechanism-for-registering-active-regions-of-memory.patch
have-power-use-add_active_range-and-free_area_init_nodes.patch
have-x86-use-add_active_range-and-free_area_init_nodes.patch
have-x86-use-add_active_range-and-free_area_init_nodes-fix.patch
have-x86_64-use-add_active_range-and-free_area_init_nodes.patch
have-ia64-use-add_active_range-and-free_area_init_nodes.patch
account-for-memmap-and-optionally-the-kernel-image-as-holes.patch
account-for-holes-that-are-outside-the-range-of-physical-memory.patch
allow-an-arch-to-expand-node-boundaries.patch
proc-readdir-race-fix-take-3.patch
proc-readdir-race-fix-take-3-fix-1.patch
proc-readdir-race-fix-take-3-fix-2.patch
namespaces-utsname-sysctl-hack.patch
reiser4-hardirq-include-fix.patch