[PATCH V4] notifier/panic: Introduce panic_notifier_filter

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



The kernel notifier infrastructure allows function callbacks to be
added in multiple lists, which are then called in the proper time,
like in a reboot or panic event. The panic_notifier_list specifically
contains the callbacks that are executed during a panic event. As any
other notifier list, the panic one has no filtering and all functions
previously registered are executed.

The kdump infrastructure, on the other hand, enables users to set
a crash kernel that is kexec'ed in a panic event, and vmcore/logs
are collected in such crash kernel. When kdump is set, by default
the panic notifiers are ignored - the kexec jumps to the crash kernel
before the list is checked and callbacks executed.

There are some cases though in which kdump users might want to
allow panic notifier callbacks to execute _before_ the kexec to
the crash kernel, for a variety of reasons - for example, users
may think kexec is very prone to fail and want to give a chance
to kmsg dumpers to run (and save logs using pstore), or maybe
some panic notifier is required to properly quiesce some hardware
that must be used to the crash kernel. For these cases, we have
the kernel parameter "crash_kexec_post_notifiers".

But there's a problem: currently it's an "all-or-nothing" situation,
the kdump user choice is either to execute all panic notifiers or
none of them. Given that panic notifiers may increase the risk of a
kdump failure, this is a tough decision and may affect the debug of
hard to reproduce bugs, if for some reason the user choice is to
enable panic notifiers, but kdump then fails.

So, this patch aims to ease this decision: we hereby introduce a filter
for the panic notifier list, in which users may select specifically
which callbacks they wish to run, allowing a safer kdump. The allowlist
should be provided using the parameter "panic_notifier_filter=a,b,..."
where a, b are valid callback names. Invalid symbols are discarded.

Currently up to 16 symbols may be passed in this list, we consider
that this numbers allows enough flexibility (and no matter what
architecture is used, at most 30 panic callbacks are registered).
In an experiment using a qemu x86 virtual machine, by default only
six callbacks are registered in the panic notifier list.
Once a valid callback name is provided in the list, such function
is allowed to be registered/unregistered in the panic_notifier_list;
all other panic callbacks are ignored. Notice that this filter is
only for the panic notifiers and has no effect in the other notifiers.

Signed-off-by: Guilherme G. Piccoli <gpiccoli@xxxxxxxxxx>
---



V4:

* Add some more clean-up suggestion from Andy (thanks).


V3:

* Implemented Alan's suggestion (thanks!), simplifying the check code
in the notifiers register/unregister functions. Notice that the
suggestion was missing a negative in the check function, I even
renamed it now, to be more clear:
s/is_panic_notifier_filtered/should_register_panic_notifier

The condition is !(A && B && C), and C is the check function, that
returns true when a symbol *is found* in the notifier filter; hence
we need to invert that here, as you can see in the code.


* Implemented Andy's suggestion (thanks!), to reduce the code in
the parameter parsing loop. Notice that strsep() modifies the buffer,
so not sure if it was a typo but the correct code here is:

single_param = strsep(&full_param_buffer, ",");


* "Bumped" the log output of the parsing function: users should be
warned in the errors (invalid symbol or exceeded entries) and
informed (pr_info) in case the parsing succeeded - I think pr_debug
was useless there.

Cheers,

Guilherme



 .../admin-guide/kernel-parameters.txt         | 14 +++++-
 include/linux/panic_notifier.h                | 10 +++++
 kernel/notifier.c                             | 44 +++++++++++++++++--
 kernel/panic.c                                | 39 ++++++++++++++++
 4 files changed, 102 insertions(+), 5 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 2fba82431efb..2dc4e98823ae 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3727,13 +3727,25 @@
 	panic_on_warn	panic() instead of WARN().  Useful to cause kdump
 			on a WARN().
 
+	panic_notifier_filter=[function-list]
+			Limit the functions registered by the panic notifier
+			infrastructure. This allowlist is composed by function
+			names, comma separated (invalid symbols are filtered
+			out). Such functionality is useful for kdump users
+			that set "crash_kexec_post_notifiers" in order to
+			execute	panic notifiers, but at the same time wish to
+			have just a subset of notifiers, not all of them. The
+			list of functions is limited to 16 entries currently.
+
 	crash_kexec_post_notifiers
 			Run kdump after running panic-notifiers and dumping
 			kmsg. This only for the users who doubt kdump always
 			succeeds in any situation.
 			Note that this also increases risks of kdump failure,
 			because some panic notifiers can make the crashed
-			kernel more unstable.
+			kernel more unstable. See the "panic_notifier_filter"
+			parameter to have more control of which notifiers to
+			execute.
 
 	parkbd.port=	[HW] Parallel port number the keyboard adapter is
 			connected to, default is 0.
diff --git a/include/linux/panic_notifier.h b/include/linux/panic_notifier.h
index 41e32483d7a7..9a96753e96d8 100644
--- a/include/linux/panic_notifier.h
+++ b/include/linux/panic_notifier.h
@@ -5,6 +5,16 @@
 #include <linux/notifier.h>
 #include <linux/types.h>
 
+/*
+ * The panic notifier filter infrastructure - each array element holds a
+ * function address, to be checked against panic_notifier register/unregister
+ * operations; these functions are allowed to be registered in the panic
+ * notifier list. This setting is useful for kdump, since users may want
+ * some panic notifiers to execute, but not all of them.
+ */
+extern unsigned long panic_nf_functions[];
+extern int panic_nf_count;
+
 extern struct atomic_notifier_head panic_notifier_list;
 
 extern bool crash_kexec_post_notifiers;
diff --git a/kernel/notifier.c b/kernel/notifier.c
index b8251dc0bc0f..4fc450bbf677 100644
--- a/kernel/notifier.c
+++ b/kernel/notifier.c
@@ -1,4 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-only
+#include <linux/panic_notifier.h>
 #include <linux/kdebug.h>
 #include <linux/kprobes.h>
 #include <linux/export.h>
@@ -127,12 +128,34 @@ static int notifier_call_chain_robust(struct notifier_block **nl,
  *	use a spinlock, and call_chain is synchronized by RCU (no locks).
  */
 
+/*
+ * The following helper is part of the panic notifier filter infrastructure;
+ * users can filter what functions they wish to allow being registered in the
+ * notifier system, restricted to the panic notifier. This is useful for kdump
+ * for example, when some notifiers are relevant but running all of them imposes
+ * risks to the kdump kernel reliability.
+ */
+static bool should_register_panic_notifier(struct notifier_block *n)
+{
+	int i;
+
+	for (i = 0; i < panic_nf_count; i++) {
+		if ((unsigned long)(n->notifier_call) == panic_nf_functions[i])
+			return true;
+	}
+
+	return false;
+}
+
 /**
  *	atomic_notifier_chain_register - Add notifier to an atomic notifier chain
  *	@nh: Pointer to head of the atomic notifier chain
  *	@n: New entry in notifier chain
  *
  *	Adds a notifier to an atomic notifier chain.
+ *	If "panic_notifier_filter" is provided, we hereby filter the
+ *	panic_notifier_list and only allow registering the functions
+ *	that are present in the filter.
  *
  *	Currently always returns zero.
  */
@@ -140,10 +163,15 @@ int atomic_notifier_chain_register(struct atomic_notifier_head *nh,
 		struct notifier_block *n)
 {
 	unsigned long flags;
-	int ret;
+	int ret = 0;
 
 	spin_lock_irqsave(&nh->lock, flags);
-	ret = notifier_chain_register(&nh->head, n);
+
+	if (!(nh == &panic_notifier_list &&
+	     (panic_nf_count > 0) &&
+	     !should_register_panic_notifier(n)))
+		ret = notifier_chain_register(&nh->head, n);
+
 	spin_unlock_irqrestore(&nh->lock, flags);
 	return ret;
 }
@@ -155,6 +183,9 @@ EXPORT_SYMBOL_GPL(atomic_notifier_chain_register);
  *	@n: Entry to remove from notifier chain
  *
  *	Removes a notifier from an atomic notifier chain.
+ *	If "panic_notifier_filter" is provided, we hereby filter the
+ *	panic_notifier_list and only allow unregistering the functions
+ *	that are present in the filter.
  *
  *	Returns zero on success or %-ENOENT on failure.
  */
@@ -162,10 +193,15 @@ int atomic_notifier_chain_unregister(struct atomic_notifier_head *nh,
 		struct notifier_block *n)
 {
 	unsigned long flags;
-	int ret;
+	int ret = 0;
 
 	spin_lock_irqsave(&nh->lock, flags);
-	ret = notifier_chain_unregister(&nh->head, n);
+
+	if (!(nh == &panic_notifier_list &&
+	     (panic_nf_count > 0) &&
+	     !should_register_panic_notifier(n)))
+		ret = notifier_chain_unregister(&nh->head, n);
+
 	spin_unlock_irqrestore(&nh->lock, flags);
 	synchronize_rcu();
 	return ret;
diff --git a/kernel/panic.c b/kernel/panic.c
index cefd7d82366f..a06fcc4b1d6e 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -31,6 +31,7 @@
 #include <linux/console.h>
 #include <linux/bug.h>
 #include <linux/ratelimit.h>
+#include <linux/kallsyms.h>
 #include <linux/debugfs.h>
 #include <asm/sections.h>
 
@@ -67,6 +68,16 @@ EXPORT_SYMBOL_GPL(panic_timeout);
 #define PANIC_PRINT_ALL_PRINTK_MSG	0x00000020
 unsigned long panic_print;
 
+/*
+ * Kernel has currently < 30 panic handlers no matter the arch,
+ * based on some code counting; so 16 items seems a good amount;
+ * users that are filtering panic notifiers shouldn't add all
+ * of them in theory, that doesn't make sense...
+ */
+#define	PANIC_NF_MAX	16
+unsigned long panic_nf_functions[PANIC_NF_MAX];
+int panic_nf_count;
+
 ATOMIC_NOTIFIER_HEAD(panic_notifier_list);
 
 EXPORT_SYMBOL(panic_notifier_list);
@@ -146,6 +157,34 @@ void nmi_panic(struct pt_regs *regs, const char *msg)
 }
 EXPORT_SYMBOL(nmi_panic);
 
+static int __init panic_notifier_filter_setup(char *buf)
+{
+	char *func;
+	unsigned long addr;
+
+	while ((func = strsep(&buf, ","))) {
+		addr = kallsyms_lookup_name(func);
+		if (!addr) {
+			pr_warn("panic_notifier_filter: invalid symbol %s\n", func);
+			continue;
+		}
+
+		if (panic_nf_count < PANIC_NF_MAX) {
+			panic_nf_functions[panic_nf_count] = addr;
+			panic_nf_count++;
+			pr_info("panic_notifier_filter: added symbol %s\n", func);
+		} else {
+			pr_warn("panic_notifier_filter: exceeded maximum notifiers (%d), aborting\n",
+				PANIC_NF_MAX);
+			panic_nf_count = 0;
+			break;
+		}
+	}
+
+	return 0;
+}
+early_param("panic_notifier_filter", panic_notifier_filter_setup);
+
 static void panic_print_sys_info(void)
 {
 	if (panic_print & PANIC_PRINT_ALL_PRINTK_MSG)
-- 
2.34.1


_______________________________________________
kexec mailing list
kexec@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/kexec



[Index of Archives]     [LM Sensors]     [Linux Sound]     [ALSA Users]     [ALSA Devel]     [Linux Audio Users]     [Linux Media]     [Kernel]     [Gimp]     [Yosemite News]     [Linux Media]

  Powered by Linux