ap_start64 serves as the 64-bit entrypoint for APs during bringup.

Since apic.c:apic_ops is not guarded against concurrent accesses,
there is a race between reset_apic(), enable_apic() and
enable_x2apic(), which results in APs crashing or getting blocked
in various scenarios (e.g., one AP enabling x2apic while another
disables xapic).

The bug is rare with vcpu count < 32, but becomes easier to
reproduce with vcpus > 64 and the following change:

lib/x86/apic.c:
 void enable_apic(void)
 {
-    printf("enabling apic\n");
     xapic_write(APIC_SPIV, 0x1ff);
 }

Serialize the bringup code in ap_start64 to fix this.

Signed-off-by: Varad Gautam <varad.gautam@xxxxxxxx>
---
 x86/cstart64.S | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/x86/cstart64.S b/x86/cstart64.S
index 7272452..238cebf 100644
--- a/x86/cstart64.S
+++ b/x86/cstart64.S
@@ -45,6 +45,9 @@ mb_boot_info:	.quad 0
 
 pt_root:	.quad ptl4
 
+ap_lock:
+	.long 0
+
 .section .init
 
 .code32
@@ -188,12 +191,18 @@ save_id:
 	retq
 
 ap_start64:
+.retry:
+	xor %eax, %eax
+	lock btsl %eax, ap_lock
+	jc .retry
 	call reset_apic
 	load_tss
 	call enable_apic
 	call save_id
 	call enable_x2apic
 	sti
+	xor %eax, %eax
+	lock btr %eax, ap_lock
 	nop
 	lock incw cpu_online_count
 
-- 
2.35.1
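
For readers less familiar with the bit-test instructions, the lock taken at the top of ap_start64 can be approximated in C11 as below. This is only a sketch of the same idea, not code from the patch: the function names ap_lock_acquire(), ap_lock_try_acquire() and ap_lock_release() are hypothetical, and the C variable ap_lock merely stands in for the .long added in cstart64.S.

```c
#include <stdatomic.h>

/* Hypothetical C stand-in for the ap_lock word added in cstart64.S. */
static atomic_uint ap_lock;

/*
 * Try to take the lock once. atomic_fetch_or() returns the previous
 * value, so bit 0 already being set means another CPU holds the lock
 * (this plays the role of the carry flag after "lock btsl").
 * Returns 1 if the lock was taken, 0 otherwise.
 */
static int ap_lock_try_acquire(void)
{
	return !(atomic_fetch_or(&ap_lock, 1u) & 1u);
}

/* Spin until the lock is taken; mirrors the ".retry" / "jc .retry" loop. */
static void ap_lock_acquire(void)
{
	while (!ap_lock_try_acquire())
		;
}

/* Clear bit 0 again; mirrors "lock btr" after the APIC setup calls. */
static void ap_lock_release(void)
{
	atomic_fetch_and(&ap_lock, ~1u);
}
```

In this picture, each AP would call ap_lock_acquire() before reset_apic() and ap_lock_release() after enable_x2apic(), so only one AP at a time runs the code that touches apic_ops.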