When HA cluster software or an administrator detects that a host is
unresponsive, they issue an NMI to the host to stop all current work
and take a crash dump.  If the kernel has already panicked or is
capturing a crash dump at that time, a further NMI can cause the crash
dump to fail.

To solve this issue, this patch set does two things:

- Don't panic on NMI if the kernel has already panicked
- Introduce a "noextnmi" boot option which masks external NMIs at
  boot time (supported only on x86)

V2:
- Use atomic_cmpxchg() instead of the current spin_trylock() to exclude
  concurrent accesses to panic() and crash_kexec()
- Don't introduce no-lock versions of panic() and crash_kexec()

---

Hidehiro Kawai (3):
      x86/panic: Fix re-entrance problem due to panic on NMI
      kexec: Fix race between panic() and crash_kexec() called directly
      x86/apic: Introduce noextnmi boot option

 Documentation/kernel-parameters.txt |  4 ++++
 arch/x86/kernel/apic/apic.c         | 17 ++++++++++++++++-
 arch/x86/kernel/nmi.c               | 15 +++++++++++----
 include/linux/kernel.h              |  1 +
 kernel/kexec.c                      | 20 ++++++++++++++++++++
 kernel/panic.c                      | 13 ++++++++++---
 6 files changed, 62 insertions(+), 8 deletions(-)

--
Hidehiro Kawai
Hitachi, Ltd. Research & Development Group
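
For readers unfamiliar with the exclusion idea mentioned under V2, here is a minimal userspace sketch of "first caller wins" using a compare-and-exchange. It is not the code from the patches: it uses C11 <stdatomic.h> in place of the kernel's atomic_cmpxchg(), and the names panic_owner, OWNER_NONE, try_claim_panic() and current_panic_owner() are hypothetical, chosen only for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define OWNER_NONE (-1)   /* hypothetical sentinel: nobody has claimed the path yet */

/* Hypothetical stand-in for a kernel atomic_t recording which CPU is panicking. */
static atomic_int panic_owner = OWNER_NONE;

/*
 * Try to claim the panic/crash-dump path for `cpu`.
 * Returns true only for the first caller.  Every later caller, including
 * the same CPU re-entering (for example from an NMI), gets false and
 * should back off instead of starting a second panic or crash dump.
 */
static bool try_claim_panic(int cpu)
{
    int expected = OWNER_NONE;

    /* Succeeds only if panic_owner still holds OWNER_NONE. */
    return atomic_compare_exchange_strong(&panic_owner, &expected, cpu);
}

/* Return the CPU that owns the path, or OWNER_NONE if it is unclaimed. */
static int current_panic_owner(void)
{
    return atomic_load(&panic_owner);
}
```

In this sketch the compare-and-exchange also records which CPU won, so a caller that loses can tell whether it lost to itself or to another CPU; whether and how the patches use that information is not shown here.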