* Vivek Goyal <vgoyal at redhat.com> wrote: > > Yet the actual bug is in that commit, 'crash_kexec_post_notifiers' > > was clearly not a no-op in the default case, against expectations. > > Hi Ingo, > > I did a quick test and in default case crash_kexec() runs before > panic notifiers. So it does look like crash_kexec_post_notifiers is > a no-op in default case. > > What am I missing. Well, look at f06e5153f4ae: diff --git a/kernel/panic.c b/kernel/panic.c index d02fa9fef46a..62e16cef9cc2 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -32,6 +32,7 @@ static unsigned long tainted_mask; static int pause_on_oops; static int pause_on_oops_flag; static DEFINE_SPINLOCK(pause_on_oops_lock); +static bool crash_kexec_post_notifiers; int panic_timeout = CONFIG_PANIC_TIMEOUT; EXPORT_SYMBOL_GPL(panic_timeout); @@ -112,9 +113,11 @@ void panic(const char *fmt, ...) /* * If we have crashed and we have a crash kernel loaded let it handle * everything else. - * Do we want to call this before we try to display a message? + * If we want to run this after calling panic_notifiers, pass + * the "crash_kexec_post_notifiers" option to the kernel. */ - crash_kexec(NULL); + if (!crash_kexec_post_notifiers) + crash_kexec(NULL); /* * Note smp_send_stop is the usual smp shutdown function, which @@ -131,6 +134,15 @@ void panic(const char *fmt, ...) kmsg_dump(KMSG_DUMP_PANIC); + /* + * If you doubt kdump always works fine in any situation, + * "crash_kexec_post_notifiers" offers you a chance to run + * panic_notifiers and dumping kmsg before kdump. + * Note: since some panic_notifiers can make crashed kernel + * more unstable, it can increase risks of the kdump failure too. + */ + crash_kexec(NULL); + bust_spinlocks(0); if (!panic_blink) Without knowing what crash_kexec() does, the patch looks buggy: it should preserve the old behavior by default, yet it will now execute a second crash_kexec() after the kmsg_dump() line. So the invariant change would have been to do: - crash_kexec(NULL); + if (!crash_kexec_post_notifiers) + crash_kexec(NULL); ... + if (crash_kexec_post_notifiers) + crash_kexec(NULL); Which in the !crash_kexec_post_notifiers flag case reduces to: crash_kexec(); ... /* NOP */ I.e. to exactly what the kernel was doing without the patch originally. Which is what my patch does. Nothing more, nothing less. There might be other bugs with the patch, I didn't consider that. A revert would be fine as well. Thanks, Ingo