(2010/12/23 8:35), Seiji Aguchi wrote:
> Hi,
>
> [Purpose]
> Kexec may trigger additional hardware errors and multiply the damage
> if it works after MCE occurred because there are some hardware-related
> operations in kexec as follows.
>   - Sending NMI to cpus
>   - Initializing hardware during boot process of second kernel.
>   - Accessing to memory and dumping it to disks.
>
> So, I propose adding a new option controlling kexec behaviour when MCE
> occurred.
> This patch prevents unnecessary hardware errors and avoid expanding
> the damage.
>
> [Patch Description]
> I added a sysctl option ,kernel.kexec_on_mce, controlling kexec behaviour
> when MCE occurred.
>
>  - Permission
>    - 0644
>  - Value(default is "1")
>    - non-zero: Kexec is enabled regardless of MCE.
>    - 0: Kexec is disabled when MCE occurred.
>
> Matrix of kernel.kexec_on_mce value, MCE and kexec behaviour
>
> --------------------------------------------------
> kernel.kexec_on_mce| MCE          | kexec behaviour
> --------------------------------------------------
> non-zero           | occurred     | enabled
>                    |------------------------------
>                    | not occurred | enabled
> --------------------------------------------------
> 0                  | occurred     | disabled
>                    |------------------------------
>                    | not occurred | enabled
> --------------------------------------------------
>
> Any comments and suggestions are welcome.

This reminds me of a quite similar patch that I've made a long time ago
but haven't posted. Following is what I found still in a branch of my
private git tree.

I guess it cannot be applied without rebase, but I think the description
of my patch could give you some different point of view etc.
Feel free to use this debris to improve yours.

Thanks,
H.Seto

<*__NOTE_THIS_PATCH_IS_NOT_READY_TO_APPLY__*>
=====

From: Hidetoshi Seto <seto.hidetoshi@xxxxxxxxxxxxxx>
Date: Fri, 10 Jul 2009 15:55:42 +0900
Subject: [PATCH] kdump, sysctl: kdump_on_safe

This patch adds a sysctl kdump_on_safe, to limit kdump to run
only on safe situation.

Quote from document in this patch:

> kdump_on_safe:
>
> When the system experiences panic, kdump will be triggered if
> crash kernel is configured. However the kdump might fail if
> the panic was caused by fatal error, such as hardware error
> reported by machine check exception. It should be rare case,
> but in the worst case, it will result in data corruption and/or
> fatal damage on the hardware.
>
> If this flag is 1, it prevents kdump from running on such
> unstable system situation. Default is 0.

This will be a possible option if your hardware can provide good
error report (in SEL etc.) and/or kernel can provide other data
enough for error investigation (console log, mcelog on x86 etc.),
and you'd like to reduce down-time by skipping kdump on such situation.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@xxxxxxxxxxxxxx>
---
 Documentation/sysctl/kernel.txt  |   15 +++++++++++++++
 arch/x86/kernel/cpu/mcheck/mce.c |    3 +++
 include/linux/kexec.h            |    3 +++
 kernel/kexec.c                   |    8 ++++++++
 kernel/sysctl.c                  |   13 +++++++++++++
 5 files changed, 42 insertions(+), 0 deletions(-)

diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index 3894eaa..9d66ab9 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -33,6 +33,7 @@ show up in /proc/sys/kernel:
 - hotplug
 - java-appletviewer           [ binfmt_java, obsolete ]
 - java-interpreter            [ binfmt_java, obsolete ]
+- kdump_on_safe               [ kexec ]
 - kstack_depth_to_print       [ X86 only ]
 - l2cr                        [ PPC only ]
 - modprobe                    ==> Documentation/debugging-modules.txt
@@ -247,6 +248,20 @@ This flag controls the L2 cache of G3 processor boards. If
 
 ==============================================================
 
+kdump_on_safe:
+
+When the system experiences panic, kdump will be triggered if
+crash kernel is configured. However the kdump might fail if
+the panic was caused by fatal error, such as hardware error
+reported by machine check exception. It should be rare case,
+but in the worst case, it will result in data corruption and/or
+fatal damage on the hardware.
+
+If this flag is 1, it prevents kdump from running on such
+unstable system situation. Default is 0.
+
+==============================================================
+
 kstack_depth_to_print: (X86 only)
 
 Controls the number of words to print when dumping the raw
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 3e2ab18..c93bb38 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -23,6 +23,7 @@
 #include <linux/sysdev.h>
 #include <linux/delay.h>
 #include <linux/ctype.h>
+#include <linux/kexec.h>
 #include <linux/sched.h>
 #include <linux/sysfs.h>
 #include <linux/types.h>
@@ -291,6 +292,8 @@ static void mce_panic(char *msg, struct mce *final, char *exp)
 	int cpu;
 
 	if (!fake_panic) {
+		set_kdump_might_fail();
+
 		/*
 		 * Make sure only one CPU runs in machine check panic
 		 */
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 03e8e8d..41e9ab0 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -209,10 +209,13 @@ int __init parse_crashkernel(char *cmdline, unsigned long long system_ram,
 int crash_shrink_memory(unsigned long new_size);
 size_t crash_get_memory_size(void);
 
+extern int kdump_might_fail;
+static inline void set_kdump_might_fail(void) { kdump_might_fail = 1; }
 #else /* !CONFIG_KEXEC */
 struct pt_regs;
 struct task_struct;
 static inline void crash_kexec(struct pt_regs *regs) { }
 static inline int kexec_should_crash(struct task_struct *p) { return 0; }
+static inline void set_kdump_might_fail(void) { }
 #endif /* CONFIG_KEXEC */
 #endif /* LINUX_KEXEC_H */
diff --git a/kernel/kexec.c b/kernel/kexec.c
index 87ebe8a..182c2f3 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -40,6 +40,9 @@
 #include <asm/system.h>
 #include <asm/sections.h>
 
+int kdump_on_safe;
+int kdump_might_fail;
+
 /* Per cpu memory for storing cpu states in case of system crash. */
 note_buf_t __percpu *crash_notes;
 
@@ -1064,6 +1067,11 @@ asmlinkage long compat_sys_kexec_load(unsigned long entry,
 
 void crash_kexec(struct pt_regs *regs)
 {
+	if (kdump_on_safe && kdump_might_fail) {
+		printk(KERN_EMERG "kexec cancelled due to unstable system.\n");
+		return;
+	}
+
 	/* Take the kexec_mutex here to prevent sys_kexec_load
 	 * running on one cpu from replacing the crash kernel
 	 * we are using after a panic on a different cpu.
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 8686b0f..8564e5c 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -156,6 +156,10 @@ extern int unaligned_dump_stack;
 
 extern struct ratelimit_state printk_ratelimit_state;
 
+#ifdef CONFIG_KEXEC
+extern int kdump_on_safe;
+#endif
+
 #ifdef CONFIG_PROC_SYSCTL
 static int proc_do_cad_pid(struct ctl_table *table, int write,
 		  void __user *buffer, size_t *lenp, loff_t *ppos);
@@ -926,6 +930,15 @@ static struct ctl_table kern_table[] = {
 		.proc_handler	= proc_dointvec,
 	},
 #endif
+#ifdef CONFIG_KEXEC
+	{
+		.procname	= "kdump_on_safe",
+		.data		= &kdump_on_safe,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec,
+	},
+#endif
 /*
  * NOTE: do not add new entries to this table unless you have read
  * Documentation/sysctl/ctl_unnumbered.txt
-- 
1.7.3.2

</*__NOTE_THIS_PATCH_IS_NOT_READY_TO_APPLY__*>
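
For reference, the check added to crash_kexec() can be modelled in plain
userspace C as below. This is only a sketch of the decision, not kernel
code: struct kdump_state and the model_* helpers are hypothetical names
made up for illustration, and only the two flags (kdump_on_safe,
kdump_might_fail) come from the patch itself.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace model of the two flags from the patch.
 * In the patch they are plain global ints in kernel/kexec.c.
 */
struct kdump_state {
	int kdump_on_safe;	/* sysctl value; non-zero means "skip kdump when unstable" */
	int kdump_might_fail;	/* set by the machine check panic path */
};

/* Models set_kdump_might_fail(): the flag is set and never cleared. */
static inline void model_set_kdump_might_fail(struct kdump_state *s)
{
	s->kdump_might_fail = 1;
}

/*
 * Models the early return in crash_kexec(): kdump is skipped only
 * when the admin opted in AND a machine check panic was seen.
 */
static inline bool model_kdump_allowed(const struct kdump_state *s)
{
	return !(s->kdump_on_safe && s->kdump_might_fail);
}
```

With the default kdump_on_safe = 0 the model always allows kdump, so
behaviour is unchanged unless the sysctl is set. Note that the polarity
is the inverse of kernel.kexec_on_mce in the quoted proposal, where 0
is the value that disables kexec after an MCE.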