From: Thomas Gleixner <tglx@xxxxxxxxxxxxx> Sent: Friday, April 14, 2023 4:44 PM

[snip]

>
> Conclusion
> ----------
>
> Adding the basic parallel bringup mechanism as provided by this series
> makes a lot of sense. Improving particular issues as pointed out in the
> analysis makes sense too.
>
> But trying to solve an application specific problem fully in the kernel
> with tons of complexity, without exploring straight forward and simple
> approaches first, does not make any sense at all.
>
> Thanks,
>
>         tglx
>
> ---
>  Documentation/admin-guide/kernel-parameters.txt |   20
>  Documentation/core-api/cpu_hotplug.rst          |   13
>  arch/Kconfig                                    |   23 +
>  arch/arm/Kconfig                                |    1
>  arch/arm/include/asm/smp.h                      |    2
>  arch/arm/kernel/smp.c                           |   18
>  arch/arm64/Kconfig                              |    1
>  arch/arm64/include/asm/smp.h                    |    2
>  arch/arm64/kernel/smp.c                         |   14
>  arch/csky/Kconfig                               |    1
>  arch/csky/include/asm/smp.h                     |    2
>  arch/csky/kernel/smp.c                          |    8
>  arch/mips/Kconfig                               |    1
>  arch/mips/cavium-octeon/smp.c                   |    1
>  arch/mips/include/asm/smp-ops.h                 |    1
>  arch/mips/kernel/smp-bmips.c                    |    1
>  arch/mips/kernel/smp-cps.c                      |   14
>  arch/mips/kernel/smp.c                          |    8
>  arch/mips/loongson64/smp.c                      |    1
>  arch/parisc/Kconfig                             |    1
>  arch/parisc/kernel/process.c                    |    4
>  arch/parisc/kernel/smp.c                        |    7
>  arch/riscv/Kconfig                              |    1
>  arch/riscv/include/asm/smp.h                    |    2
>  arch/riscv/kernel/cpu-hotplug.c                 |   14
>  arch/x86/Kconfig                                |   45 --
>  arch/x86/include/asm/apic.h                     |    5
>  arch/x86/include/asm/cpu.h                      |    5
>  arch/x86/include/asm/cpumask.h                  |    5
>  arch/x86/include/asm/processor.h                |    1
>  arch/x86/include/asm/realmode.h                 |    3
>  arch/x86/include/asm/sev-common.h               |    3
>  arch/x86/include/asm/smp.h                      |   26 -
>  arch/x86/include/asm/topology.h                 |   23 -
>  arch/x86/include/asm/tsc.h                      |    2
>  arch/x86/kernel/acpi/sleep.c                    |    9
>  arch/x86/kernel/apic/apic.c                     |   22 -
>  arch/x86/kernel/callthunks.c                    |    4
>  arch/x86/kernel/cpu/amd.c                       |    2
>  arch/x86/kernel/cpu/cacheinfo.c                 |   21
>  arch/x86/kernel/cpu/common.c                    |   50 --
>  arch/x86/kernel/cpu/topology.c                  |    3
>  arch/x86/kernel/head_32.S                       |   14
>  arch/x86/kernel/head_64.S                       |  121 +++++
>  arch/x86/kernel/sev.c                           |    2
>  arch/x86/kernel/smp.c                           |    3
>  arch/x86/kernel/smpboot.c                       |  508 ++++++++----------------
>  arch/x86/kernel/topology.c                      |   98 ----
>  arch/x86/kernel/tsc.c                           |   20
>  arch/x86/kernel/tsc_sync.c                      |   36 -
>  arch/x86/power/cpu.c                            |   37 -
>  arch/x86/realmode/init.c                        |    3
>  arch/x86/realmode/rm/trampoline_64.S            |   27 +
>  arch/x86/xen/enlighten_hvm.c                    |   11
>  arch/x86/xen/smp_hvm.c                          |   16
>  arch/x86/xen/smp_pv.c                           |   56 +-
>  drivers/acpi/processor_idle.c                   |    4
>  include/linux/cpu.h                             |    4
>  include/linux/cpuhotplug.h                      |   17
>  kernel/cpu.c                                    |  397 +++++++++++++++++-
>  kernel/smp.c                                    |    2
>  kernel/smpboot.c                                |  163 -------
>  62 files changed, 953 insertions(+), 976 deletions(-)
>

I smoke-tested several Linux guest configurations running on Hyper-V,
using the "kernel/git/tglx/devel.git hotplug" tree as updated on
April 26th. No functional issues, but encountered one cosmetic issue
(details below).

Configurations tested:
* 16 vCPUs and 32 vCPUs
* 1 NUMA node and 2 NUMA nodes
* Parallel bring-up enabled and disabled via kernel boot line
* "Normal" VMs and SEV-SNP VMs running with a paravisor on Hyper-V.
  This config can use parallel bring-up because most of the SNP-ness
  is hidden in the paravisor. I was glad to see this work properly.

There's not much difference in performance with and without parallel
bring-up on the 32 vCPU VM. Without parallel, the time is about
26 milliseconds. With parallel, it's about 24 ms. So bring-up is
already fast in the virtual environment.

The cosmetic issue is in the dmesg log, and arises because Hyper-V
enumerates SMT CPUs differently from many other environments. In a
Hyper-V guest, the SMT threads in a core are numbered as <even, odd>
pairs. Guest CPUs #0 & #1 are SMT threads in a core, as are #2 & #3,
etc. With parallel bring-up, here's the dmesg output:

[    0.444345] smp: Bringing up secondary CPUs ...
[    0.445139] .... node #0, CPUs: #2 #4 #6 #8 #10 #12 #14 #16 #18 #20 #22 #24 #26 #28 #30
[    0.454112] x86: Booting SMP configuration:
[    0.456035] #1 #3 #5 #7 #9 #11 #13 #15 #17 #19 #21 #23 #25 #27 #29 #31
[    0.466120] smp: Brought up 1 node, 32 CPUs
[    0.467036] smpboot: Max logical packages: 1
[    0.468035] smpboot: Total of 32 processors activated (153240.06 BogoMIPS)

The function announce_cpu() is specifically testing for CPU #1 to
output the "Booting SMP configuration" message. In a Hyper-V guest,
CPU #1 is the second SMT thread in a core, so it isn't started until
all the even-numbered CPUs are started.

I don't know if this cosmetic issue is worth fixing, but I thought
I'd point it out. In any case,

Tested-by: Michael Kelley <mikelley@xxxxxxxxxxxxx>
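
For anyone who wants to see the ordering effect in isolation, here is a
toy model in Python. It is not kernel code. It only assumes what is
described above: even-numbered CPUs are started before odd-numbered
ones, and the "Booting SMP configuration" header is keyed on CPU #1.
The names bringup_order(), announce_log() and the header_on_first
switch are made up for illustration. Keying the header on the first
CPU announced is just one hypothetical way the message could be moved,
not a proposed patch.

```python
HEADER = "x86: Booting SMP configuration:"


def bringup_order(num_cpus):
    """Model the start order seen in the log above.

    CPU 0 is the boot CPU and is already running. Of the remaining
    CPUs, the even-numbered ones come first, then the odd-numbered ones.
    """
    secondaries = range(1, num_cpus)
    evens = [cpu for cpu in secondaries if cpu % 2 == 0]
    odds = [cpu for cpu in secondaries if cpu % 2 == 1]
    return evens + odds


def announce_log(order, header_on_first=False):
    """Return the sequence of log events for a given start order.

    By default the header is emitted when CPU #1 is announced, which is
    the behavior described for announce_cpu(). With header_on_first=True
    (a hypothetical alternative) it is emitted before the first CPU
    announced, whatever its number.
    """
    log = []
    for index, cpu in enumerate(order):
        emit_header = (index == 0) if header_on_first else (cpu == 1)
        if emit_header:
            log.append(HEADER)
        log.append(f"#{cpu}")
    return log


if __name__ == "__main__":
    order = bringup_order(8)
    print(announce_log(order))
    print(announce_log(order, header_on_first=True))
```

With 8 CPUs the default variant puts the header after #2 #4 #6 and
before #1, matching the shape of the dmesg output above. The
header_on_first variant puts it ahead of #2. With a plain sequential
order (#1, #2, #3, ...) both variants give the same result.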