https://bugzilla.kernel.org/show_bug.cgi?id=217247

--- Comment #2 from Sean Christopherson (seanjc@xxxxxxxxxx) ---
+tglx

On Sat, Mar 25, 2023, bugzilla-daemon@xxxxxxxxxx wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=217247
>
>             Bug ID: 217247
>            Summary: BUG: kernel NULL pointer dereference, address:
>                     000000000000000c / speculation_ctrl_update
>            Product: Virtualization
>            Version: unspecified
>     Kernel Version: 6.1.20
>           Hardware: All
>                 OS: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: kvm
>           Assignee: virtualization_kvm@xxxxxxxxxxxxxxxxxxxx
>           Reporter: hvtaifwkbgefbaei@xxxxxxxxx
>         Regression: No
>
> Created attachment 304023
>   --> https://bugzilla.kernel.org/attachment.cgi?id=304023&action=edit
> kernel config
>
> This is 6.1.20 with only ZFS 2.1.9 module added.
> I booted kernel with acpi=off because this old Ryzen 1600X system is getting
> unreliable (so only one CPU is online with acpi=off, and it has been reliable
> before this splat).
>
> 2023-03-25T13:28:40,794781+02:00 BUG: kernel NULL pointer dereference, address: 000000000000000c
> 2023-03-25T13:28:40,794786+02:00 #PF: supervisor read access in kernel mode
> 2023-03-25T13:28:40,794788+02:00 #PF: error_code(0x0000) - not-present page
> 2023-03-25T13:28:40,794790+02:00 PGD 0 P4D 0
> 2023-03-25T13:28:40,794793+02:00 Oops: 0000 [#1] PREEMPT SMP NOPTI
> 2023-03-25T13:28:40,794795+02:00 CPU: 0 PID: 917598 Comm: qemu-kvm Tainted: P W O 6.1.20+ #12
> 2023-03-25T13:28:40,794798+02:00 Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X370 Taichi, BIOS P6.20 01/03/2020
> 2023-03-25T13:28:40,794800+02:00 RIP: 0010:do_raw_spin_lock+0x6/0xb0

This looks like amd_set_core_ssb_state() explodes when it tries to acquire
ssb_state.shared_state.lock.

Aha!
With acpi=off, I assume __apic_intr_mode_select() will return
APIC_VIRTUAL_WIRE_NO_CONFIG:

	/* Check MP table or ACPI MADT configuration */
	if (!smp_found_config) {
		disable_ioapic_support();
		if (!acpi_lapic) {
			pr_info("APIC: ACPI MADT or MP tables are not detected\n");
			return APIC_VIRTUAL_WIRE_NO_CONFIG;
		}
		return APIC_VIRTUAL_WIRE;
	}

Which will cause native_smp_prepare_cpus() to bail early and not run through
speculative_store_bypass_ht_init(), leaving a NULL ssb_state.shared_state:

	switch (apic_intr_mode) {
	case APIC_PIC:
	case APIC_VIRTUAL_WIRE_NO_CONFIG:
		disable_smp();
		return;
	case APIC_SYMMETRIC_IO_NO_ROUTING:
		disable_smp();
		/* Setup local timer */
		x86_init.timers.setup_percpu_clockev();
		return;
	case APIC_VIRTUAL_WIRE:
	case APIC_SYMMETRIC_IO:
		break;
	}

I believe this will remedy your problem.  I don't see anything that will
obviously break in native_smp_prepare_cpus() by continuing on with a "bad"
APIC.  Hopefully Thomas can weigh in on whether or not it's a sane change.

---
 arch/x86/kernel/smpboot.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 9013bb28255a..ff69f8e3c392 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1409,22 +1409,17 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
 	case APIC_PIC:
 	case APIC_VIRTUAL_WIRE_NO_CONFIG:
 		disable_smp();
-		return;
+		break;
 	case APIC_SYMMETRIC_IO_NO_ROUTING:
 		disable_smp();
-		/* Setup local timer */
-		x86_init.timers.setup_percpu_clockev();
-		return;
+		fallthrough;
 	case APIC_VIRTUAL_WIRE:
 	case APIC_SYMMETRIC_IO:
+		x86_init.timers.setup_percpu_clockev();
+		smp_get_logical_apicid();
 		break;
 	}
 
-	/* Setup local timer */
-	x86_init.timers.setup_percpu_clockev();
-
-	smp_get_logical_apicid();
-
 	pr_info("CPU0: ");
 	print_cpu_info(&cpu_data(0));
 

base-commit: b0d237087c674c43df76c1a0bc2737592f3038f4
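
For anyone following along, the control-flow problem described above can be reproduced in a few lines of userspace C. This is only a rough sketch and not kernel code: every name in it (enum intr_mode, struct core_state, late_init(), the two prepare_*() functions) is a hypothetical stand-in invented for illustration. late_init() plays the role of the later init step that sets up the shared state, and the timer_ready flag stands in for the per-CPU timer setup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's APIC interrupt modes. */
enum intr_mode {
	MODE_PIC,
	MODE_VIRTUAL_WIRE_NO_CONFIG,
	MODE_SYMMETRIC_IO_NO_ROUTING,
	MODE_VIRTUAL_WIRE,
	MODE_SYMMETRIC_IO,
};

struct shared_state { int lock; };

struct core_state {
	struct shared_state *shared;	/* NULL until late_init() has run */
	bool smp_disabled;
	bool timer_ready;
};

static struct shared_state boot_shared;

/* Stand-in for the later init step that allocates the shared state. */
static void late_init(struct core_state *st)
{
	st->shared = &boot_shared;
}

/* Old shape: the "no config" modes return before late_init() runs. */
static void prepare_with_early_return(struct core_state *st, enum intr_mode mode)
{
	switch (mode) {
	case MODE_PIC:
	case MODE_VIRTUAL_WIRE_NO_CONFIG:
		st->smp_disabled = true;
		return;
	case MODE_SYMMETRIC_IO_NO_ROUTING:
		st->smp_disabled = true;
		st->timer_ready = true;
		return;
	case MODE_VIRTUAL_WIRE:
	case MODE_SYMMETRIC_IO:
		break;
	}

	st->timer_ready = true;
	late_init(st);
}

/* Proposed shape: every mode breaks out of the switch and reaches late_init(). */
static void prepare_with_break(struct core_state *st, enum intr_mode mode)
{
	switch (mode) {
	case MODE_PIC:
	case MODE_VIRTUAL_WIRE_NO_CONFIG:
		st->smp_disabled = true;
		break;
	case MODE_SYMMETRIC_IO_NO_ROUTING:
		st->smp_disabled = true;
		/* fall through */
	case MODE_VIRTUAL_WIRE:
	case MODE_SYMMETRIC_IO:
		st->timer_ready = true;
		break;
	}

	late_init(st);
}

/* Compare both shapes for the mode that acpi=off is assumed to select. */
static void demo(void)
{
	struct core_state before = {0};
	struct core_state after = {0};

	prepare_with_early_return(&before, MODE_VIRTUAL_WIRE_NO_CONFIG);
	prepare_with_break(&after, MODE_VIRTUAL_WIRE_NO_CONFIG);

	assert(before.shared == NULL);	/* a later lock on it would oops */
	assert(after.shared != NULL);
}
```

Calling demo() from a main() shows the difference: with the early return the shared pointer is still NULL for the "no config" mode, while the break-based version leaves it initialized and keeps SMP disabled. Whether continuing past the switch is safe in the real kernel is exactly the open question for Thomas.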