(2015/03/23 16:19), Ingo Molnar wrote:
>
> * Baoquan He <bhe at redhat.com> wrote:
>
>> CC more people ...
>>
>> On 03/07/15 at 01:31am, "Hatayama, Daisuke" wrote:
>>> The commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45 introduced
>>> "crash_kexec_post_notifiers" kernel boot option, which toggles
>>> whether panic() calls crash_kexec() before panic_notifiers and dump
>>> kmsg or after.
>>>
>>> The problem is that the commit overlooks panic_on_oops kernel boot
>>> option. If it is enabled, crash_kexec() is called directly without
>>> going through panic() in oops path.
>>>
>>> To fix this issue, this patch adds a check to
>>> "crash_kexec_post_notifiers" in the condition of kexec_should_crash().
>>>
>>> Also, put a comment in kexec_should_crash() to explain not obvious
>>> things on this patch.
>>>
>>> Signed-off-by: HATAYAMA Daisuke <d.hatayama at jp.fujitsu.com>
>>> Acked-by: Baoquan He <bhe at redhat.com>
>>> Tested-by: Hidehiro Kawai <hidehiro.kawai.ez at hitachi.com>
>>> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt at hitachi.com>
>>> ---
>>>  include/linux/kernel.h |  3 +++
>>>  kernel/kexec.c         | 11 +++++++++++
>>>  kernel/panic.c         |  2 +-
>>>  3 files changed, 15 insertions(+), 1 deletion(-)
>
> This is hack upon hack, but why was this crap merged in the first
> place?
>
> I see two problems just by cursory review:
>
> 1)
>
> Firstly, the real bug in:
>
>   f06e5153f4ae ("kernel/panic.c: add "crash_kexec_post_notifiers" option for kdump after panic_notifers")
>
> Was that crash_kexec() was called unconditionally after notifiers were
> called, which should be fixed via the simple patch below (untested).
> Looks much simpler than your fix.

No, Daisuke's patch is not for that case.
Since kdump has a special hook in the kernel oops path, when both
panic_on_oops and crash_kernel are set, panic() is never called.
Please see oops_end() in arch/x86/kernel/dumpstack.c
----
void oops_end(unsigned long flags, struct pt_regs *regs, int signr)
{
        if (regs && kexec_should_crash(current))
                crash_kexec(regs);
----

Of course, crash_kexec() never returns unless kexec fails unexpectedly.
Thus, kexec_should_crash() should return 0 if crash_kexec_post_notifiers
is set.
(Semantically, it is a bit strange that panic_on_oops doesn't call panic(),
but that is another topic.)

However, your patch is also needed, since the first crash_kexec() in
panic() can fail when crash_kexec_post_notifiers is not set. In that case,
the kernel calls the notifiers and then calls crash_kexec() a 2nd time.
That 2nd call is useless.

So, here is my reviewed-by.

Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt at hitachi.com>

I'll reply to the latter part in another mail.

Thank you,

>
> 2)
>
> Secondly, and more importantly, the whole premise of commit
> f06e5153f4ae is broken IMHO:
>
>   "This can help rare situations where kdump fails because of unstable
>    crashed kernel or hardware failure (memory corruption on critical
>    data/code)"
>
> wtf?
>
> If the kernel crashed due to a kernel crash, then the kernel booting
> up in whatever hardware state should be able to do a clean bootup. The
> fix for those 'rare situations' should be to fix the real bug (for
> example by making hardware driver init (or deinit) sequences more
> robust), not to paper it over by ordering around crash-time sequences
> ...
>
> If it crashed due to some hardware failure, there's literally an
> infinite amount of failure modes that may or may not be impacted by
> kexec crash-time handling ordering. We don't want to put a zillion
> such flags into the kernel proper just to allow the perturbation of
> the kernel.
>
> Thanks,
>
> 	Ingo
>
> diff --git a/kernel/panic.c b/kernel/panic.c
> index 8136ad76e5fd..774614f72cbd 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -142,7 +142,8 @@ void panic(const char *fmt, ...)
>  	 * Note: since some panic_notifiers can make crashed kernel
>  	 * more unstable, it can increase risks of the kdump failure too.
>  	 */
> -	crash_kexec(NULL);
> +	if (crash_kexec_post_notifiers)
> +		crash_kexec(NULL);
>
>  	bust_spinlocks(0);
>

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt at hitachi.com
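
For readers following the thread, the ordering discussed above can be
modelled in user space. This is only a sketch, not kernel code: the
struct, the model_* function names and the kexec_succeeds knob are
hypothetical, the kexec_should_crash() condition is reduced to
panic_on_oops alone, and the regs check in oops_end() is left out. It
assumes both changes from the thread are applied, that is, the
crash_kexec_post_notifiers check in kexec_should_crash() and the guard
around the second crash_kexec() call in panic().

```c
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct crash_model {
	bool panic_on_oops;              /* models the panic_on_oops setting */
	bool crash_kexec_post_notifiers; /* models the boot option */
	bool kexec_succeeds;             /* assumption: does the jump to the crash kernel work? */
	bool panic_called;
	bool kexec_done;
	int notifiers_run;
	int crash_kexec_calls;
};

/* Returns true when the model "jumped" into the crash kernel. */
static bool model_crash_kexec(struct crash_model *m)
{
	m->crash_kexec_calls++;
	if (m->kexec_succeeds)
		m->kexec_done = true;
	return m->kexec_done;
}

/* With the check from Daisuke's patch: defer to panic() if the option is set. */
static bool model_kexec_should_crash(const struct crash_model *m)
{
	if (m->crash_kexec_post_notifiers)
		return false;
	return m->panic_on_oops; /* simplified: the other conditions are omitted */
}

/* With the guard from Ingo's patch: no second attempt unless the option is set. */
static void model_panic(struct crash_model *m)
{
	m->panic_called = true;
	if (!m->crash_kexec_post_notifiers && model_crash_kexec(m))
		return;
	m->notifiers_run++; /* stands in for panic notifiers and kmsg dump */
	if (m->crash_kexec_post_notifiers)
		model_crash_kexec(m);
}

/* Oops path: the kdump hook comes first, then panic() if panic_on_oops is set. */
static void model_oops_end(struct crash_model *m)
{
	if (model_kexec_should_crash(m) && model_crash_kexec(m))
		return;
	if (m->panic_on_oops)
		model_panic(m);
}
```

In this model, with panic_on_oops and crash_kexec_post_notifiers both
set, the oops path no longer jumps to the crash kernel directly; the
notifiers run once and only then is crash_kexec() attempted. With the
option clear and a failing kexec, crash_kexec() is attempted once in the
oops hook and once at the top of panic(), and not again after the
notifiers.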