+ panic-add-sysctl-to-dump-all-cpus-backtraces-on-oops-event.patch added to -mm tree

akpm@xxxxxxxxxxxxxxxxxxxx · Fri, 17 Apr 2020 17:47:02 -0700

The patch titled
     Subject: panic: add sysctl to dump all CPUs backtraces on oops event
has been added to the -mm tree.  Its filename is
     panic-add-sysctl-to-dump-all-cpus-backtraces-on-oops-event.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/panic-add-sysctl-to-dump-all-cpus-backtraces-on-oops-event.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/panic-add-sysctl-to-dump-all-cpus-backtraces-on-oops-event.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included in linux-next and is updated
there every 3-4 working days.

------------------------------------------------------
From: "Guilherme G. Piccoli" <gpiccoli@xxxxxxxxxxxxx>
Subject: panic: add sysctl to dump all CPUs backtraces on oops event

Usually when the kernel reaches an oops condition, it's a point of no
return; if the kernel splat does not provide enough debug information,
one of the last resorts is to collect a kernel crash dump and analyze
it.  The problem with this approach is that collecting the dump requires
a panic (to kexec into the crash kernel).  In an environment with
multiple virtual machines, users may prefer to try living with the oops,
at least until they can properly shut down their VMs / finish their
important tasks.

This patch implements a way to collect a bit more debug detail when an
oops event is reached, by printing all CPUs' backtraces through the use
of NMIs (on architectures that support that).  The sysctl added (and
documented) here is called "oops_all_cpu_backtrace" and, when set, will
(as the name suggests) dump all CPUs' backtraces.
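
As a rough illustration of how the knob could be driven from user space,
here is a minimal sketch.  It assumes the sysctl shows up at
/proc/sys/kernel/oops_all_cpu_backtrace (the usual location for kern_table
entries); the helper names set_oops_all_cpu_backtrace() and
get_oops_all_cpu_backtrace() are hypothetical and not part of this patch.

```c
#include <stdio.h>

/* Assumed location of the knob on a CONFIG_SMP kernel with this patch. */
#define OOPS_ALL_CPU_BT_PATH "/proc/sys/kernel/oops_all_cpu_backtrace"

/*
 * Hypothetical helper: write 0 or 1 to the sysctl file at 'path'.
 * Returns 0 on success, -1 on failure or if 'enable' is out of range.
 */
int set_oops_all_cpu_backtrace(const char *path, int enable)
{
	FILE *f;
	int ok;

	/* Reject values outside the 0..1 range the sysctl accepts. */
	if (enable != 0 && enable != 1)
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = fprintf(f, "%d\n", enable) > 0;
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}

/*
 * Hypothetical helper: read the current value back.
 * Returns the integer stored in the file, or -1 on failure.
 */
int get_oops_all_cpu_backtrace(const char *path)
{
	FILE *f = fopen(path, "r");
	int val = -1;

	if (!f)
		return -1;

	if (fscanf(f, "%d", &val) != 1)
		val = -1;
	fclose(f);

	return val;
}
```

Calling set_oops_all_cpu_backtrace(OOPS_ALL_CPU_BT_PATH, 1) as root should
have the same effect as "sysctl -w kernel.oops_all_cpu_backtrace=1"; the
path is passed as a parameter so the helpers can be exercised against an
ordinary file.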

Far from ideal, this may nonetheless be the last option for users who for
some reason cannot panic on oops.  Most of the time oopses are clear
enough to indicate the kernel portion that must be investigated, but in
virtual environments it's possible to observe hypervisor/KVM issues that
could lead to oopses shown in other guests' CPUs (like virtual APIC
crashes).  This patch hence aims to help debug such complex issues
without resorting to kdump.

Link: http://lkml.kernel.org/r/20200327224116.21030-1-gpiccoli@xxxxxxxxxxxxx
Signed-off-by: Guilherme G. Piccoli <gpiccoli@xxxxxxxxxxxxx>
Reviewed-by: Kees Cook <keescook@xxxxxxxxxxxx>
Cc: Luis Chamberlain <mcgrof@xxxxxxxxxx>
Cc: Iurii Zaikin <yzaikin@xxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Vlastimil Babka <vbabka@xxxxxxx>
Cc: Randy Dunlap <rdunlap@xxxxxxxxxxxxx>
Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 Documentation/admin-guide/sysctl/kernel.rst |   16 ++++++++++++++++
 include/linux/kernel.h                      |    6 ++++++
 kernel/panic.c                              |   11 +++++++++++
 kernel/sysctl.c                             |   11 +++++++++++
 4 files changed, 44 insertions(+)

--- a/Documentation/admin-guide/sysctl/kernel.rst~panic-add-sysctl-to-dump-all-cpus-backtraces-on-oops-event
+++ a/Documentation/admin-guide/sysctl/kernel.rst
@@ -555,6 +555,22 @@ rate for each task.
 scanned for a given scan.
 
 
+oops_all_cpu_backtrace:
+=======================
+
+If this option is set, the kernel will send an NMI to all CPUs to dump
+their backtraces when an oops event occurs. It should be used as a last
+resort in case a panic cannot be triggered (to protect VMs running, for
+example) or kdump can't be collected. This file shows up if CONFIG_SMP
+is enabled.
+
+0: Won't show all CPUs' backtraces when an oops is detected.
+This is the default behavior.
+
+1: Will non-maskably interrupt all CPUs and dump their backtraces when
+an oops event is detected.
+
+
 osrelease, ostype & version
 ===========================
 
--- a/include/linux/kernel.h~panic-add-sysctl-to-dump-all-cpus-backtraces-on-oops-event
+++ a/include/linux/kernel.h
@@ -520,6 +520,12 @@ static inline u32 int_sqrt64(u64 x)
 }
 #endif
 
+#ifdef CONFIG_SMP
+extern unsigned int sysctl_oops_all_cpu_backtrace;
+#else
+#define sysctl_oops_all_cpu_backtrace 0
+#endif /* CONFIG_SMP */
+
 extern void bust_spinlocks(int yes);
 extern int oops_in_progress;		/* If set, an oops, panic(), BUG() or die() is in progress */
 extern int panic_timeout;
--- a/kernel/panic.c~panic-add-sysctl-to-dump-all-cpus-backtraces-on-oops-event
+++ a/kernel/panic.c
@@ -36,6 +36,14 @@
 #define PANIC_TIMER_STEP 100
 #define PANIC_BLINK_SPD 18
 
+#ifdef CONFIG_SMP
+/*
+ * Should we dump all CPUs backtraces in an oops event?
+ * Defaults to 0, can be changed via sysctl.
+ */
+unsigned int __read_mostly sysctl_oops_all_cpu_backtrace;
+#endif /* CONFIG_SMP */
+
 int panic_on_oops = CONFIG_PANIC_ON_OOPS_VALUE;
 static unsigned long tainted_mask =
 	IS_ENABLED(CONFIG_GCC_PLUGIN_RANDSTRUCT) ? (1 << TAINT_RANDSTRUCT) : 0;
@@ -515,6 +523,9 @@ void oops_enter(void)
 	/* can't trust the integrity of the kernel anymore: */
 	debug_locks_off();
 	do_oops_enter_exit();
+
+	if (sysctl_oops_all_cpu_backtrace)
+		trigger_all_cpu_backtrace();
 }
 
 /*
--- a/kernel/sysctl.c~panic-add-sysctl-to-dump-all-cpus-backtraces-on-oops-event
+++ a/kernel/sysctl.c
@@ -801,6 +801,17 @@ static struct ctl_table kern_table[] = {
 		.proc_handler	= proc_dointvec,
 	},
 #endif
+#ifdef CONFIG_SMP
+	{
+		.procname	= "oops_all_cpu_backtrace",
+		.data		= &sysctl_oops_all_cpu_backtrace,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= SYSCTL_ONE,
+	},
+#endif /* CONFIG_SMP */
 	{
 		.procname	= "pid_max",
 		.data		= &pid_max,
_

Patches currently in -mm which might be from gpiccoli@xxxxxxxxxxxxx are

panic-add-sysctl-to-dump-all-cpus-backtraces-on-oops-event.patch