Hi All,

My simple test results:

Our box: Fujitsu PQ2000 with 1 node for hot-plug.

Before the patchset:

+---------------------------------------------+
|                                             |
|  NUMA node0 CPU: 0-23,256-279               +------+
|  NUMA node1 CPU: 24-47,280-303              |      |
|                                             |      |
+---------------------------------------------+      | Hot-plug
+---------------------------------------------+      |
|                                             |      |
|  NUMA node0 CPU: 0-23,256-279               <------+
|  NUMA node1 CPU: 24-47,280-303              |
|  NUMA node2 CPU: 64-69,72-77,80-85,88-93... +------+
|  NUMA node3 CPU: 96-101,104-109,112-117,... |      |
|                                             |      |
+---------------------------------------------+      | Hot-remove
+---------------------------------------------+      |
|                                             |      |
|  NUMA node0 CPU: 0-23,256-279               <------+
|  NUMA node1 CPU: 24-47,280-303              |
|                                             |
+---------------------------------------------+

After the patchset:

+---------------------------------------------+
|                                             |
|  NUMA node0 CPU: 0-23,48-71                 +------+
|  NUMA node1 CPU: 24-47,72-95                |      |
|                                             |      |
+---------------------------------------------+      | Hot-plug
+---------------------------------------------+      |
|                                             |      |
|  NUMA node0 CPU: 0-23,48-71                 <------+
|  NUMA node1 CPU: 24-47,72-95                |
|  NUMA node2 CPU: 96-143                     +------+
|  NUMA node3 CPU: 144-191                    |      |
|                                             |      |
+---------------------------------------------+      | Hot-remove
+---------------------------------------------+      |
|                                             |      |
|  NUMA node0 CPU: 0-23,48-71                 <------+
|  NUMA node1 CPU: 24-47,72-95                |
|                                             |
+---------------------------------------------+

I also tested some cases in VMs with QEMU. When I get more nodes, I will test the whole function.

Thanks,
Liyang.

At 03/03/2017 04:02 PM, Dou Liyang wrote:
[Summary]:

1. Revert two commits.
2. Fix the order of logical CPU IDs.
3. Move the validation of processor IDs to hot-plug time.

The mapping of "cpuid <-> nodeid" is established at boot time via ACPI tables to keep associations of workqueues and other node-related items consistent across CPU hotplug, as follows:

Step 1. Make the "Logical CPU ID <-> Processor ID/UID" mapping fixed, using MADT:

  We generate the logical CPU IDs in the order of the Local APIC/x2APIC IDs and get the mapping of Processor ID/UID <-> Local APIC ID directly from MADT. So we get the mapping of

  *Processor ID/UID <-> Local APIC ID <-> Logical CPU ID*

Step 2. Make the "Processor ID/UID <-> Node ID(_PXM)" mapping fixed, using DSDT:

  The mapping of "Processor ID/UID <-> Node ID(_PXM)" is ready-made in each entity. We just use it directly.

But ACPI tables are unreliable, and failures with that boot-time mapping have been reported on machines where the ACPI table and the physical information retrieved at actual hotplug are inconsistent.

We have already found two bugs:

1. Duplicated processor IDs in DSDT.
   This has been fixed by commits:
   8e089eaa1999 ("acpi: Provide mechanism to validate processors in the ACPI tables") and
   fd74da217df7 ("acpi: Validate processor id when mapping the processor")

2. The _PXM in DSDT is inconsistent with the one in MADT.
   It may cause the bug shown in:
   https://lkml.org/lkml/2017/2/12/200

And one more phenomenon shows up on some specific boxes:

1. The logical CPU IDs are not contiguous. For example:
   Node2: 64-69, 72-77, 80-85, 88-93,...

More strange things may happen in the future. We shouldn't just fix them one by one each time; we should solve this problem at the source so that such problems do not happen again and again.

A simple and easy way to do that:

1. Do step 1 only when the CPU's enabled flag is set.
2. Do step 2 at hot-plug time, not at boot time, where we did some useless work.

This also keeps the mapping of "cpuid <-> nodeid" fixed and avoids excessive use of the ACPI tables.
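To make the two steps easier to follow, here is a rough user-space toy model in Python. It is not kernel code and not what the patches literally do: the names MadtEntry, CpuMap, hotplug() and the (processor_id, pxm) list standing in for the DSDT are all made up for illustration, and the sketch assumes that a hot-added CPU simply takes the next free logical ID and that a duplicated processor ID is rejected when the CPU is hot-plugged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MadtEntry:
    processor_id: int  # ACPI Processor ID/UID
    apic_id: int       # Local APIC/x2APIC ID
    enabled: bool      # enabled flag of the MADT entry


class CpuMap:
    """Toy model of the two-step mapping (hypothetical, not kernel code)."""

    def __init__(self, madt, dsdt_pxm):
        # madt: list of MadtEntry, in table order.
        # dsdt_pxm: list of (processor_id, pxm) pairs standing in for the DSDT.
        self.apic_of_proc = {e.processor_id: e.apic_id for e in madt}
        self.dsdt_pxm = dsdt_pxm
        self.cpu_of_apic = {}
        self.node_of_cpu = {}
        # Step 1 at boot: only enabled CPUs get a logical ID, in MADT order.
        for e in madt:
            if e.enabled:
                self._logical_id(e.apic_id)

    def _logical_id(self, apic_id):
        # Reuse an existing ID so that a re-plugged CPU keeps its number;
        # otherwise hand out the next free one.
        return self.cpu_of_apic.setdefault(apic_id, len(self.cpu_of_apic))

    def hotplug(self, processor_id):
        """Step 2 at hot-plug time: validate the ID, then bind CPU to node."""
        nodes = [pxm for pid, pxm in self.dsdt_pxm if pid == processor_id]
        if len(nodes) != 1:
            raise ValueError(
                f"processor id {processor_id} is missing or duplicated")
        cpu = self._logical_id(self.apic_of_proc[processor_id])
        self.node_of_cpu[cpu] = nodes[0]
        return cpu


# Two CPUs enabled at boot, two more that only show up at hot-plug time.
madt = [
    MadtEntry(processor_id=0, apic_id=0, enabled=True),
    MadtEntry(processor_id=1, apic_id=2, enabled=True),
    MadtEntry(processor_id=2, apic_id=64, enabled=False),
    MadtEntry(processor_id=3, apic_id=66, enabled=False),
]
dsdt_pxm = [(0, 0), (1, 0), (2, 2), (3, 2)]

cpus = CpuMap(madt, dsdt_pxm)
first = cpus.hotplug(3)
second = cpus.hotplug(2)
```

In this model the boot-time CPUs get a dense range of logical IDs, the hot-added ones continue right after it, and a processor ID that appears twice in the stand-in DSDT list is refused at hot-plug time instead of at boot. Node binding for the CPUs that are already present at boot is left out to keep the sketch short.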
Change log:

v2 -> v3:
  1. Rewrite the changelogs: copy the changelogs that
     Thomas Gleixner <tglx@xxxxxxxxxxxxx> rewrote for patches 1, 2, 4 and 5.
  2. s/duplicate_processor_id()/acpi_duplicate_processor_id()/,
     on Thomas Gleixner <tglx@xxxxxxxxxxxxx>'s advice.
  3. Modify the error handling in acpi_processor_ids_walk(),
     on Thomas Gleixner <tglx@xxxxxxxxxxxxx>'s advice.
  4. Add a new patch for restoring the order of CPU IDs.

v1 -> v2:
  1. Fix some comments.
  2. Add the verification of duplicate processor IDs.

Dou Liyang (5):
  Revert "x86/acpi: Set persistent cpuid <-> nodeid mapping when booting"
  Revert "x86/acpi: Enable MADT APIs to return disabled apicids"
  x86/acpi: Restore the order of CPU IDs
  acpi/processor: Implement DEVICE operator for processor enumeration
  acpi/processor: Check for duplicate processor ids at hotplug time

 arch/x86/kernel/acpi/boot.c   |   9 ++-
 arch/x86/kernel/apic/apic.c   |  26 +++------
 drivers/acpi/acpi_processor.c |  57 +++++++++++++-----
 drivers/acpi/bus.c            |   1 -
 drivers/acpi/processor_core.c | 133 +++++++-----------------------------------
 include/linux/acpi.h          |   5 +-
 6 files changed, 79 insertions(+), 152 deletions(-)