On 05/10/20 at 02:22pm, Rafael Aquini wrote:
> > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > > index 7bc83f3d9bdf..4a69fe49a70d 100644
> > > --- a/Documentation/admin-guide/kernel-parameters.txt
> > > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > > @@ -3404,6 +3404,21 @@
> > >  	panic_on_warn	panic() instead of WARN(). Useful to cause kdump
> > >  			on a WARN().
> > >
> > > +	panic_on_taint=	[KNL] conditionally panic() in add_taint()
> > > +			Format: <str>
> > Changed it as 'Format: <string>' to be
> > consistent with the existing other options?
>
> I can resubmit with the change, if it's a strong req and the surgery
> cannot be done at merge time.

Yeah, maybe the maintainer can help adjust this; I'm not sure who will
pick it up. No, it's not a strong request. People might just get a
little confused about which format to follow when a new kernel option
is added.

> >
> > > +			Specifies, as a string, the TAINT flag set that will
> > > +			compose a bitmask for calling panic() when the kernel
> > > +			gets tainted.
> > > +			See Documentation/admin-guide/tainted-kernels.rst for
> > > +			details on the taint flags that users can pick to
> > > +			compose the bitmask to assign to panic_on_taint.
> > > +			When the string is prefixed with a '-' the bitmask
> > > +			set in panic_on_taint will be mutually exclusive
> > > +			with the sysctl knob kernel.tainted, and any attempt
> > > +			to write to that sysctl will fail with -EINVAL for
> > > +			any taint value that masks with the flags set for
> > > +			this option.
> > > +
> > >  	crash_kexec_post_notifiers
> > >  			Run kdump after running panic-notifiers and dumping
> > >  			kmsg. This only for the users who doubt kdump always
> > > diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> > > index 9b7a8d74a9d6..66bc102cb59a 100644
> > > --- a/include/linux/kernel.h
> > > +++ b/include/linux/kernel.h
> > > @@ -528,6 +528,8 @@ extern int panic_on_oops;
> > >  extern int panic_on_unrecovered_nmi;
> > >  extern int panic_on_io_nmi;
> > >  extern int panic_on_warn;
> > > +extern unsigned long panic_on_taint;
> > > +extern bool panic_on_taint_exclusive;
> > >  extern int sysctl_panic_on_rcu_stall;
> > >  extern int sysctl_panic_on_stackoverflow;
> > >
> > > diff --git a/kernel/panic.c b/kernel/panic.c
> > > index b69ee9e76cb2..65c62f8a1de8 100644
> > > --- a/kernel/panic.c
> > > +++ b/kernel/panic.c
> > > @@ -25,6 +25,7 @@
> > >  #include <linux/kexec.h>
> > >  #include <linux/sched.h>
> > >  #include <linux/sysrq.h>
> > > +#include <linux/ctype.h>
> > >  #include <linux/init.h>
> > >  #include <linux/nmi.h>
> > >  #include <linux/console.h>
> > > @@ -44,6 +45,8 @@ static int pause_on_oops_flag;
> > >  static DEFINE_SPINLOCK(pause_on_oops_lock);
> > >  bool crash_kexec_post_notifiers;
> > >  int panic_on_warn __read_mostly;
> > > +unsigned long panic_on_taint;
> > > +bool panic_on_taint_exclusive = false;
> > >
> > >  int panic_timeout = CONFIG_PANIC_TIMEOUT;
> > >  EXPORT_SYMBOL_GPL(panic_timeout);
> > > @@ -434,6 +437,11 @@ void add_taint(unsigned flag, enum lockdep_ok lockdep_ok)
> > >  		pr_warn("Disabling lock debugging due to kernel taint\n");
> > >
> > >  	set_bit(flag, &tainted_mask);
> > > +
> > > +	if (tainted_mask & panic_on_taint) {
> > > +		panic_on_taint = 0;
> >
> > This panic_on_taint resetting is redundant? It will trigger crash, do we
> > need care if it's 0 or not?
> >
>
> We might still get more than one CPU hitting a taint adding code path after
> the one that tripped here called panic. To avoid multiple calls to panic,
> in that particular scenario, we clear the panic_on_taint bitmask out.
> Also, albeit non-frequent, we might be tracking TAINT_WARN, and still hit
> a WARN_ON() in the panic / kdump path, thus incurring in a second
> (and unwanted) call to panic here.

Hmm, this CPU will set panic_cpu first, so all other CPUs have to stop
and get no chance to execute panic(). But yes, clearing panic_on_taint
makes the code easier to understand.

> >
> > > +		panic("panic_on_taint set ...");
> > > +	}
> > >  }
> > >  EXPORT_SYMBOL(add_taint);
> > >
> > > @@ -686,3 +694,35 @@ static int __init oops_setup(char *s)
> > >  	return 0;
> > >  }
> > >  early_param("oops", oops_setup);
> > > +
> > > +static int __init panic_on_taint_setup(char *s)
> > > +{
> > > +	/* we just ignore panic_on_taint if passed without flags */
> > > +	if (!s)
> > > +		goto out;
> > > +
> > > +	for (; *s; s++) {
> > > +		int i;
> > > +
> > > +		if (*s == '-') {
> > > +			panic_on_taint_exclusive = true;
> > > +			continue;
> > > +		}
> > > +
> > > +		for (i = 0; i < TAINT_FLAGS_COUNT; i++) {
> > > +			if (toupper(*s) == taint_flags[i].c_true) {
> > > +				set_bit(i, &panic_on_taint);
> > > +				break;
> > > +			}
> > > +		}
> >
> > Read admin-guide/tainted-kernels.rst, but still do not get what 'G' means.
> > If I specify 'panic_on_taint="G"' or 'panic_on_taint="-G"' in cmdline,
> > what is expected for this customer behaviour?
> >
>
> This will not panic the system as no taint flag gets actually set in
> panic_on_taint bitmask for G.
>
> G is the counterpart of P, and appears on print_tainted() whenever
> TAINT_PROPRIETARY_MODULE is not set. panic_on_taint doesn't set
> anything for G, as it doesn't represent any taint, but the lack
> of one particular taint, instead.
>
> (apparently, TAINT_PROPRIETARY_MODULE is the only taint flag
> that follows that pattern of having an extra assigned letter
> that means its absence, and perhaps it should be removed)

Yeah, agreed. I will draft a patch to remove it and see if anyone
objects.
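
To make the 'G' case concrete, here is a minimal userspace sketch of the parsing loop quoted above. It is not the kernel code: taint_flags_sketch[] is a hypothetical, abbreviated stand-in for the kernel's taint_flags[] table (first ten bits only, letters as listed in tainted-kernels.rst), and parse_panic_on_taint() is an illustrative name that returns the mask instead of writing the panic_on_taint global.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's taint_flags[] entries. */
struct taint_flag_sketch {
	char c_true;	/* letter printed when the taint bit is set */
	char c_false;	/* letter printed when the taint bit is clear */
};

/* Abbreviated table, bits 0..9 only; an assumption for illustration. */
static const struct taint_flag_sketch taint_flags_sketch[] = {
	{ 'P', 'G' },	/* bit 0: proprietary module loaded */
	{ 'F', ' ' },	/* bit 1: module force-loaded */
	{ 'S', ' ' },	/* bit 2: out-of-spec system */
	{ 'R', ' ' },	/* bit 3: module force-unloaded */
	{ 'M', ' ' },	/* bit 4: machine check */
	{ 'B', ' ' },	/* bit 5: bad page */
	{ 'U', ' ' },	/* bit 6: taint requested by userspace */
	{ 'D', ' ' },	/* bit 7: kernel died (oops/BUG) */
	{ 'A', ' ' },	/* bit 8: ACPI table overridden */
	{ 'W', ' ' },	/* bit 9: kernel issued a warning */
};

#define TAINT_SKETCH_COUNT \
	(sizeof(taint_flags_sketch) / sizeof(taint_flags_sketch[0]))

/*
 * Mirror of the quoted loop: '-' only flips the exclusive flag, and a
 * letter sets a bit only when it matches some entry's c_true. Letters
 * that exist solely as c_false (such as 'G') match nothing.
 */
static unsigned long parse_panic_on_taint(const char *s, bool *exclusive)
{
	unsigned long mask = 0;

	*exclusive = false;
	if (!s)
		return 0;

	for (; *s; s++) {
		size_t i;

		if (*s == '-') {
			*exclusive = true;
			continue;
		}

		for (i = 0; i < TAINT_SKETCH_COUNT; i++) {
			if (toupper((unsigned char)*s) ==
			    taint_flags_sketch[i].c_true) {
				mask |= 1UL << i;
				break;
			}
		}
	}

	return mask;
}
```

Under these assumptions, "G" and "-G" both produce an empty mask (the latter still sets the exclusive flag), while "pw" sets bits 0 and 9. That matches the explanation that G never arms a panic because only c_true letters are compared.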