----- On Feb 16, 2016, at 2:49 PM, rostedt rostedt@xxxxxxxxxxx wrote:

> From: "Steven Rostedt (Red Hat)" <rostedt@xxxxxxxxxxx>
>
> The tracepoint infrastructure uses RCU sched protection to enable and
> disable tracepoints safely. There are some instances where tracepoints are
> used in infrastructure code (like kfree()) that get called after a CPU is
> going offline, and perhaps when it is coming back online but hasn't been
> registered yet.
>
> This can produce the following warning:
>
> [ INFO: suspicious RCU usage. ]
> 4.4.0-00006-g0fe53e8-dirty #34 Tainted: G S
> -------------------------------
> include/trace/events/kmem.h:141 suspicious rcu_dereference_check() usage!
>
> other info that might help us debug this:
>
> RCU used illegally from offline CPU! rcu_scheduler_active = 1, debug_locks = 1
> no locks held by swapper/8/0.
>
> stack backtrace:
> CPU: 8 PID: 0 Comm: swapper/8 Tainted: G S 4.4.0-00006-g0fe53e8-dirty #34
> Call Trace:
> [c0000005b76c78d0] [c0000000008b9540] .dump_stack+0x98/0xd4 (unreliable)
> [c0000005b76c7950] [c00000000010c898] .lockdep_rcu_suspicious+0x108/0x170
> [c0000005b76c79e0] [c00000000029adc0] .kfree+0x390/0x440
> [c0000005b76c7a80] [c000000000055f74] .destroy_context+0x44/0x100
> [c0000005b76c7b00] [c0000000000934a0] .__mmdrop+0x60/0x150
> [c0000005b76c7b90] [c0000000000e3ff0] .idle_task_exit+0x130/0x140
> [c0000005b76c7c20] [c000000000075804] .pseries_mach_cpu_die+0x64/0x310
> [c0000005b76c7cd0] [c000000000043e7c] .cpu_die+0x3c/0x60
> [c0000005b76c7d40] [c0000000000188d8] .arch_cpu_idle_dead+0x28/0x40
> [c0000005b76c7db0] [c000000000101e6c] .cpu_startup_entry+0x50c/0x560
> [c0000005b76c7ed0] [c000000000043bd8] .start_secondary+0x328/0x360
> [c0000005b76c7f90] [c000000000008a6c] start_secondary_prolog+0x10/0x14
>
> This warning is not a false positive either. RCU is not protecting code that
> is being executed while the CPU is offline.
>
> Instead of playing "whack-a-mole(TM)" and adding conditional statements to
> the tracepoints we find that are used in this instance, simply add a
> cpu_online() test to the tracepoint code where the tracepoint will be
> ignored if the CPU is offline.
>
> Use of raw_smp_processor_id() is fine, as there should never be a case where
> the tracepoint code goes from running on a CPU that is online and suddenly
> gets migrated to a CPU that is offline.
>
> Link: http://lkml.kernel.org/r/1455387773-4245-1-git-send-email-kda@xxxxxxxxxxxxxxxxx

If I get this right, you are proposing to "hide" events happening on dying
CPUs during CPU hot-unplug from the tracers, in order to fix an issue caused
by the interaction of RCU-sched (used for tracepoint synchronization) with
CPU hotplug.

Removing tracing visibility of hot-unplug events seems to be an unwelcome
side-effect. I don't know how far Thomas Gleixner got in his overhaul of CPU
hotplug, but he might have something to say about this, as I believe he would
be the first user concerned.

Thoughts?

Thanks,

Mathieu

> Reported-by: Denis Kirjanov <kda@xxxxxxxxxxxxxxxxx>
> Fixes: 97e1c18e8d17b ("tracing: Kernel Tracepoints")
> Cc: stable@xxxxxxxxxxxxxxx # v2.6.28+
> Signed-off-by: Steven Rostedt <rostedt@xxxxxxxxxxx>
> ---
>  include/linux/tracepoint.h | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
> index acd522a91539..acfdbf353a0b 100644
> --- a/include/linux/tracepoint.h
> +++ b/include/linux/tracepoint.h
> @@ -14,8 +14,10 @@
>   * See the file COPYING for more details.
>   */
>
> +#include <linux/smp.h>
>  #include <linux/errno.h>
>  #include <linux/types.h>
> +#include <linux/cpumask.h>
>  #include <linux/rcupdate.h>
>  #include <linux/tracepoint-defs.h>
>
> @@ -132,6 +134,9 @@ extern void syscall_unregfunc(void);
>  		void *it_func;						\
>  		void *__data;						\
>  									\
> +		if (!cpu_online(raw_smp_processor_id()))		\
> +			return;						\
> +									\
>  		if (!(cond))						\
>  			return;						\
>  		prercu;							\
> --
> 2.6.4

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
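
For reference, the effect of the proposed check on the tracepoint fast path
can be modelled with a minimal user-space sketch. This is not the kernel's
__DO_TRACE macro; cpu_online(), raw_smp_processor_id() and the "CPU 8 is
offline" assumption are stand-ins chosen here purely for illustration (they
mirror the backtrace above), and the probe call is a plain printf instead of
the RCU-sched-protected callback walk.

/*
 * Sketch only: models how the added cpu_online() test short-circuits a
 * tracepoint before any RCU-sched read-side section would be entered,
 * which is also why the event becomes invisible to tracers on a dying CPU.
 */
#include <stdbool.h>
#include <stdio.h>

/* Stubs standing in for the kernel helpers; CPU 8 is treated as offline. */
static bool cpu_online(int cpu)        { return cpu != 8; }
static int  raw_smp_processor_id(void) { return 8; }

static void call_probes(const char *event)
{
	/* In the kernel this would be rcu_dereference_sched() + probe calls. */
	printf("probe fired for %s\n", event);
}

static void do_trace(const char *event)
{
	if (!cpu_online(raw_smp_processor_id()))
		return;		/* offline CPU: the event is silently dropped */

	call_probes(event);
}

int main(void)
{
	do_trace("kmem:kfree");	/* prints nothing, since "CPU 8" is offline */
	return 0;
}

The design trade-off under discussion is visible in the early return: it
avoids the illegal RCU usage from an offline CPU, but it also means no probe
ever sees events emitted from that window of CPU hot-unplug.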