On Mon, Nov 11 2024 at 17:23, Peter Zijlstra wrote:
> On Fri, Nov 08, 2024 at 08:49:31AM -0500, Len Brown wrote:
>> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
>> index 766f092dab80..910cb2d72c13 100644
>> --- a/arch/x86/kernel/smpboot.c
>> +++ b/arch/x86/kernel/smpboot.c
>> @@ -1377,6 +1377,9 @@ void smp_kick_mwait_play_dead(void)
>>  		for (i = 0; READ_ONCE(md->status) != newstate && i < 1000; i++) {
>>  			/* Bring it out of mwait */
>>  			WRITE_ONCE(md->control, newstate);
>> +			/* If MONITOR unreliable, send IPI */
>> +			if (boot_cpu_has_bug(X86_BUG_MONITOR))
>> +				__apic_send_IPI(cpu, RESCHEDULE_VECTOR);
>>  			udelay(5);
>>  		}
>
> Going over that code again, mwait_play_dead() is doing __mwait(.exc=0)
> with IRQs disabled.

And the APIC is shut down. So it won't react on the IPI either.

> So that IPI you're trying to send there won't do no nothing :-/
>
> Now that comment there says MCE/NMI/SMI are still open (non-maskable
> etc.) so perhaps prod it on the NMI vector?
>
> This does seem to suggest the above code path wasn't actually tested.

I'm not sure whether that's just a suggestion :)

> Perhaps mark your local machine with BUG_MONITOR, remove the md->control
> WRITE_ONCE() and try kexec to test it?
>
> Thomas, any other thoughts?

NMI should work. See exc_nmi():

	if (arch_cpu_is_offline(smp_processor_id())) {
		if (microcode_nmi_handler_enabled())
			microcode_offline_nmi_handler();
		return;
	}

Thanks,

        tglx