On Thu, 2021-12-16 at 16:52 -0600, Tom Lendacky wrote: > On baremetal, I haven't seen an issue. This only seems to have a problem > with Qemu/KVM. > > With 191f08997577 I could boot without issues with and without the > no_parallel_bringup. Only after I applied e78fa57dd642 did the failure happen. > > With e78fa57dd642 I could boot 64 vCPUs pretty consistently, but when I > jumped to 128 vCPUs it failed again. When I moved the series to > df9726cb7178, then 64 vCPUs also failed pretty consistently. > > Strange thing is it is random. Sometimes (rarely) it works on the first > boot and then sometimes it doesn't, at which point it will reset and > reboot 3 or 4 times and then make it past the failure and fully boot. Hm, some of that is just artifacts of timing, I'm sure. But now I'm staring at the way that early_setup_idt() can run in parallel on all CPUs, rewriting bringup_idt_descr and loading it. To start with, let's try unlocking the trampoline_lock much later, after cpu_init_exception_handling() has loaded the real IDT. I think we can probably make secondaries load the real IDT early and never use bringup_idt_descr at all, can't we? But let's see if this makes it go away, to start with... diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 0cd6373bc3f2..2307f7575ab4 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -59,7 +59,7 @@ #include <asm/cpu_device_id.h> #include <asm/uv/uv.h> #include <asm/sigframe.h> - +#include <asm/realmode.h> #include "cpu.h" u32 elf_hwcap2 __read_mostly; @@ -2060,6 +2060,7 @@ void cpu_init_secondary(void) * on this CPU in cpu_init_exception_handling(). */ cpu_init_exception_handling(); + clear_bit(0, (unsigned long *)trampoline_lock); cpu_init(); } #endif diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S index 3e4c3c416bce..db01b56574cd 100644 --- a/arch/x86/kernel/head_64.S +++ b/arch/x86/kernel/head_64.S @@ -273,14 +273,6 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL) */ movq initial_stack(%rip), %rsp - /* Drop the realmode protection. For the boot CPU the pointer is NULL! */ - movq trampoline_lock(%rip), %rax - testq %rax, %rax - jz .Lsetup_idt - lock - btrl $0, (%rax) - -.Lsetup_idt: /* Setup and Load IDT */ pushq %rsi call early_setup_idt
Attachment:
smime.p7s
Description: S/MIME cryptographic signature