On 2016/08/15/ at 19:22, Hidehiro Kawai wrote: > Hi Dave, > > Thank you for the review. > >> From: Dave Young [mailto:dyoung@xxxxxxxxxx] >> Sent: Friday, August 12, 2016 12:17 PM >> >> Thanks for the update. >> On 08/10/16 at 05:09pm, Hidehiro Kawai wrote: >>> Daniel Walker reported problems which happens when >>> crash_kexec_post_notifiers kernel option is enabled >>> (https://lkml.org/lkml/2015/6/24/44). >>> >>> In that case, smp_send_stop() is called before entering kdump routines >>> which assume other CPUs are still online. As the result, for x86, >>> kdump routines fail to save other CPUs' registers and disable >>> virtualization extensions. >> Seems you simplified the changelog, but I think a little more details >> will be helpful to understand the patch. You know sometimes lkml.org >> does not work well. > So, I'll try another archives when I post patch set next time. Hi Hidehiro Kawai, What's the status of this patch set, are you going to send an updated version? Regards, Xunlei >>> To fix this problem, call a new kdump friendly function, >>> crash_smp_send_stop(), instead of the smp_send_stop() when >>> crash_kexec_post_notifiers is enabled. crash_smp_send_stop() is a >>> weak function, and it just call smp_send_stop(). Architecture >>> codes should override it so that kdump can work appropriately. >>> This patch only provides x86-specific version. >>> >>> For Xen's PV kernel, just keep the current behavior. >> Could you explain a bit about above Xen PV kernel behavior? >> >> BTW, this version looks better, I think I'm fine with this version >> besides of the questions about changelog. > As for Dom0 kernel, it doesn't use crash_kexec routines, and > it relies on panic notifier chain. At the end of the chain, > xen_panic_event is called, and it issues a hypercall which > requests Hypervisor to execute kdump. This means whether > crash_kexec_panic_notifiers is set or not, panic notifiers > are called after smp_send_stop. Even if we save registers > in Dom0 kernel, they seem to be ignored (Hypervisor is responsible > for that). This is why I kept the current behavior for Xen. > > For PV DomU kernel, kdump is not supported. For PV HVM > DomU, I'm not sure what will happen on panic because I > couldn't boot PV HVM DomU and test it. But I think it will > work similarly to baremetal kernels with extra cleanups > for Hypervisor. > > Best regards, > > Hidehiro Kawai > >>> Changes in V4: >>> - Keep to use smp_send_stop if crash_kexec_post_notifiers is not set >>> - Rename panic_smp_send_stop to crash_smp_send_stop >>> - Don't change the behavior for Xen's PV kernel >>> >>> Changes in V3: >>> - Revise comments, description, and symbol names >>> >>> Changes in V2: >>> - Replace smp_send_stop() call with crash_kexec version which >>> saves cpu states and cleans up VMX/SVM >>> - Drop a fix for Problem 1 at this moment >>> >>> Reported-by: Daniel Walker <dwalker@xxxxxxxxxx> >>> Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" option) >>> Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@xxxxxxxxxxx> >>> Cc: Dave Young <dyoung@xxxxxxxxxx> >>> Cc: Baoquan He <bhe@xxxxxxxxxx> >>> Cc: Vivek Goyal <vgoyal@xxxxxxxxxx> >>> Cc: Eric Biederman <ebiederm@xxxxxxxxxxxx> >>> Cc: Masami Hiramatsu <mhiramat@xxxxxxxxxx> >>> Cc: Daniel Walker <dwalker@xxxxxxxxxx> >>> Cc: Xunlei Pang <xpang@xxxxxxxxxx> >>> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx> >>> Cc: Ingo Molnar <mingo@xxxxxxxxxx> >>> Cc: "H. Peter Anvin" <hpa@xxxxxxxxx> >>> Cc: Borislav Petkov <bp@xxxxxxx> >>> Cc: David Vrabel <david.vrabel@xxxxxxxxxx> >>> Cc: Toshi Kani <toshi.kani@xxxxxxx> >>> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> >>> --- >>> arch/x86/include/asm/kexec.h | 1 + >>> arch/x86/include/asm/smp.h | 1 + >>> arch/x86/kernel/crash.c | 22 +++++++++++++++++--- >>> arch/x86/kernel/smp.c | 5 ++++ >>> kernel/panic.c | 47 ++++++++++++++++++++++++++++++++++++------ >>> 5 files changed, 66 insertions(+), 10 deletions(-) >>> >>> diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h >>> index d2434c1..282630e 100644 >>> --- a/arch/x86/include/asm/kexec.h >>> +++ b/arch/x86/include/asm/kexec.h >>> @@ -210,6 +210,7 @@ struct kexec_entry64_regs { >>> >>> typedef void crash_vmclear_fn(void); >>> extern crash_vmclear_fn __rcu *crash_vmclear_loaded_vmcss; >>> +extern void kdump_nmi_shootdown_cpus(void); >>> >>> #endif /* __ASSEMBLY__ */ >>> >>> diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h >>> index ebd0c16..f70989c 100644 >>> --- a/arch/x86/include/asm/smp.h >>> +++ b/arch/x86/include/asm/smp.h >>> @@ -50,6 +50,7 @@ struct smp_ops { >>> void (*smp_cpus_done)(unsigned max_cpus); >>> >>> void (*stop_other_cpus)(int wait); >>> + void (*crash_stop_other_cpus)(void); >>> void (*smp_send_reschedule)(int cpu); >>> >>> int (*cpu_up)(unsigned cpu, struct task_struct *tidle); >>> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c >>> index 9616cf7..650830e 100644 >>> --- a/arch/x86/kernel/crash.c >>> +++ b/arch/x86/kernel/crash.c >>> @@ -133,15 +133,31 @@ static void kdump_nmi_callback(int cpu, struct pt_regs *regs) >>> disable_local_APIC(); >>> } >>> >>> -static void kdump_nmi_shootdown_cpus(void) >>> +void kdump_nmi_shootdown_cpus(void) >>> { >>> nmi_shootdown_cpus(kdump_nmi_callback); >>> >>> disable_local_APIC(); >>> } >>> >>> +/* Override the weak function in kernel/panic.c */ >>> +void crash_smp_send_stop(void) >>> +{ >>> + static int cpus_stopped; >>> + >>> + if (cpus_stopped) >>> + return; >>> + >>> + if (smp_ops.crash_stop_other_cpus) >>> + smp_ops.crash_stop_other_cpus(); >>> + else >>> + smp_send_stop(); >>> + >>> + cpus_stopped = 1; >>> +} >>> + >>> #else >>> -static void kdump_nmi_shootdown_cpus(void) >>> +void crash_smp_send_stop(void) >>> { >>> /* There are no cpus to shootdown */ >>> } >>> @@ -160,7 +176,7 @@ void native_machine_crash_shutdown(struct pt_regs *regs) >>> /* The kernel is broken so disable interrupts */ >>> local_irq_disable(); >>> >>> - kdump_nmi_shootdown_cpus(); >>> + crash_smp_send_stop(); >>> >>> /* >>> * VMCLEAR VMCSs loaded on this cpu if needed. >>> diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c >>> index 658777c..68f8cc2 100644 >>> --- a/arch/x86/kernel/smp.c >>> +++ b/arch/x86/kernel/smp.c >>> @@ -32,6 +32,8 @@ >>> #include <asm/nmi.h> >>> #include <asm/mce.h> >>> #include <asm/trace/irq_vectors.h> >>> +#include <asm/kexec.h> >>> + >>> /* >>> * Some notes on x86 processor bugs affecting SMP operation: >>> * >>> @@ -342,6 +344,9 @@ struct smp_ops smp_ops = { >>> .smp_cpus_done = native_smp_cpus_done, >>> >>> .stop_other_cpus = native_stop_other_cpus, >>> +#if defined(CONFIG_KEXEC_CORE) >>> + .crash_stop_other_cpus = kdump_nmi_shootdown_cpus, >>> +#endif >>> .smp_send_reschedule = native_smp_send_reschedule, >>> >>> .cpu_up = native_cpu_up, >>> diff --git a/kernel/panic.c b/kernel/panic.c >>> index ca8cea1..e6480e2 100644 >>> --- a/kernel/panic.c >>> +++ b/kernel/panic.c >>> @@ -71,6 +71,32 @@ void __weak nmi_panic_self_stop(struct pt_regs *regs) >>> panic_smp_self_stop(); >>> } >>> >>> +/* >>> + * Stop other CPUs in panic. Architecture dependent code may override this >>> + * with more suitable version. For example, if the architecture supports >>> + * crash dump, it should save registers of each stopped CPU and disable >>> + * per-CPU features such as virtualization extensions. >>> + */ >>> +void __weak crash_smp_send_stop(void) >>> +{ >>> + static int cpus_stopped; >>> + >>> + /* >>> + * This function can be called twice in panic path, but obviously >>> + * we execute this only once. >>> + */ >>> + if (cpus_stopped) >>> + return; >>> + >>> + /* >>> + * Note smp_send_stop is the usual smp shutdown function, which >>> + * unfortunately means it may not be hardened to work in a panic >>> + * situation. >>> + */ >>> + smp_send_stop(); >>> + cpus_stopped = 1; >>> +} >>> + >>> atomic_t panic_cpu = ATOMIC_INIT(PANIC_CPU_INVALID); >>> >>> /* >>> @@ -164,14 +190,21 @@ void panic(const char *fmt, ...) >>> if (!_crash_kexec_post_notifiers) { >>> printk_nmi_flush_on_panic(); >>> __crash_kexec(NULL); >>> - } >>> >>> - /* >>> - * Note smp_send_stop is the usual smp shutdown function, which >>> - * unfortunately means it may not be hardened to work in a panic >>> - * situation. >>> - */ >>> - smp_send_stop(); >>> + /* >>> + * Note smp_send_stop is the usual smp shutdown function, which >>> + * unfortunately means it may not be hardened to work in a >>> + * panic situation. >>> + */ >>> + smp_send_stop(); >>> + } else { >>> + /* >>> + * If we want to do crash dump after notifier calls and >>> + * kmsg_dump, we will need architecture dependent extra >>> + * works in addition to stopping other CPUs. >>> + */ >>> + crash_smp_send_stop(); >>> + } >>> >>> /* >>> * Run any panic handlers, including those that might need to >>> >>>