Re: [RFC PATCH 6/6] tracing: use sched-RCU instead of SRCU for rcuidle tracepoints

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



----- On Oct 26, 2020, at 4:44 PM, rostedt rostedt@xxxxxxxxxxx wrote:

> On Mon, 26 Oct 2020 10:28:07 -0400 (EDT)
> Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> wrote:
> 
>> I agree with Peter. Removing the trace_.*_rcuidle weirdness from the tracepoint
>> API and fixing all callers to ensure they trace from a context where RCU is
>> watching would simplify instrumentation of the Linux kernel, thus making it
>> harder
>> for subtle bugs to hide and be unearthed only when tracing is enabled. This is
> 
> Note, the lockdep RCU checking of a tracepoint is outside of it being
> enabled or disable. So if a non rcuidle() tracepoint is in a location that
> RCU is not watching, it will complain loudly, even if you don't enable that
> tracepoint.
> 
>> AFAIU the general approach Thomas Gleixner has been aiming for recently, and I
>> think it is a good thing.
>> 
>> So if we consider this our target, and that the current state of things is that
>> we need to have RCU watching around callback invocation, then removing the
>> dependency on SRCU seems like an overall simplification which does not regress
>> feature-wise nor speed-wise compared with what we have upstream today. The next
>> steps would then be to audit all rcuidle tracepoints and make sure the context
>> where they are placed has RCU watching already, so we can remove the tracepoint
> 
> Just remove the _rcuidle() from them, and lockdep will complain if they are
> being called without RCU watching.

That's the easy part. The patch removing rcuidle is attached to this email,
and here are the splats I get just when booting the machine (see below). The
part which takes more time is fixing those splats and figuring out how to
restructure the code. For instance, what should we do about the rcuidle
tracepoint within printk() ?

Also, how do we test rcuidle tracepoints triggered only when printk is called
from printk warnings ? We'd also need to test on arm32 boards, because some arm
architecture code uses rcuidle tracepoints as well.

As this is beyond the scope of this patch set, I can either drop this patch
entirely (it's not required for sleepable tracepoints), or keep it as an
intermediate cleanup step. Removing rcuidle tracepoints entirely is beyond
the effort Michael and I can undertake now.

=============================
WARNING: suspicious RCU usage

5.9.1+ #72 Not tainted
-----------------------------
=============================
./include/trace/events/preemptirq.h:42 suspicious rcu_dereference_check() usage!
WARNING: suspicious RCU usage

other info that might help us debug this:


rcu_scheduler_active = 1, debug_locks = 1
5.9.1+ #72 Not tainted
RCU used illegally from extended quiescent state!
no locks held by swapper/0/0.
-----------------------------
./include/trace/events/preemptirq.h:38 suspicious rcu_dereference_check() usage!

stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.1+ #72

other info that might help us debug this:


rcu_scheduler_active = 1, debug_locks = 1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
RCU used illegally from extended quiescent state!
no locks held by swapper/1/0.
 dump_stack+0x8d/0xbb

stack backtrace:
 ? default_idle+0xa/0x20
 trace_hardirqs_on+0xed/0xf0
 default_idle+0xa/0x20
 default_idle_call+0x4f/0x1e0
 do_idle+0x1ef/0x2c0
 cpu_startup_entry+0x19/0x20
 start_kernel+0x578/0x59d
 secondary_startup_64+0xa4/0xb0
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.9.1+ #72

=============================
WARNING: suspicious RCU usage
5.9.1+ #72 Not tainted
-----------------------------
./include/trace/events/lock.h:37 suspicious rcu_dereference_check() usage!

other info that might help us debug this:


rcu_scheduler_active = 1, debug_locks = 1
RCU used illegally from extended quiescent state!
no locks held by swapper/0/0.

stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.1+ #72
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 dump_stack+0x8d/0xbb
 lock_acquire+0x346/0x3b0
 ? __lock_acquire+0x2e7/0x1da0
 _raw_spin_lock+0x2c/0x40
 ? vprintk_emit+0x9f/0x410
 vprintk_emit+0x9f/0x410
 printk+0x52/0x6e
 lockdep_rcu_suspicious+0x1b/0xe0
 ? default_idle+0xa/0x20
 trace_hardirqs_on+0xed/0xf0
 default_idle+0xa/0x20
 default_idle_call+0x4f/0x1e0
 do_idle+0x1ef/0x2c0
 cpu_startup_entry+0x19/0x20
 start_kernel+0x578/0x59d
 secondary_startup_64+0xa4/0xb0

=============================
WARNING: suspicious RCU usage
5.9.1+ #72 Not tainted
-----------------------------
./include/trace/events/lock.h:63 suspicious rcu_dereference_check() usage!

other info that might help us debug this:


rcu_scheduler_active = 1, debug_locks = 1
RCU used illegally from extended quiescent state!
1 lock held by swapper/0/0:
 #0: ffffffff97a80158 (logbuf_lock){-...}-{2:2}, at: vprintk_emit+0x9f/0x410

stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.1+ #72
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 dump_stack+0x8d/0xbb
 lock_release+0x25a/0x280
 _raw_spin_unlock+0x17/0x30
 vprintk_emit+0xdf/0x410
 printk+0x52/0x6e
 lockdep_rcu_suspicious+0x1b/0xe0
 ? default_idle+0xa/0x20
 trace_hardirqs_on+0xed/0xf0
 default_idle+0xa/0x20
 default_idle_call+0x4f/0x1e0
 do_idle+0x1ef/0x2c0
 cpu_startup_entry+0x19/0x20
 start_kernel+0x578/0x59d
 secondary_startup_64+0xa4/0xb0

=============================
WARNING: suspicious RCU usage
5.9.1+ #72 Not tainted
-----------------------------
./include/trace/events/printk.h:33 suspicious rcu_dereference_check() usage!

other info that might help us debug this:


rcu_scheduler_active = 1, debug_locks = 1
RCU used illegally from extended quiescent state!
2 locks held by swapper/0/0:
 #0: ffffffff97a801a0 (console_lock){+.+.}-{0:0}, at: vprintk_emit+0x126/0x410
 #1: ffffffff97a7fec0 (console_owner){....}-{0:0}, at: console_unlock+0x190/0x5d0

stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.1+ #72
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 dump_stack+0x8d/0xbb
 console_unlock+0x5ad/0x5d0
 vprintk_emit+0x133/0x410
 printk+0x52/0x6e
 lockdep_rcu_suspicious+0x1b/0xe0
 ? default_idle+0xa/0x20
 trace_hardirqs_on+0xed/0xf0
 default_idle+0xa/0x20
 default_idle_call+0x4f/0x1e0
 do_idle+0x1ef/0x2c0
 cpu_startup_entry+0x19/0x20
 start_kernel+0x578/0x59d
 secondary_startup_64+0xa4/0xb0

=============================
WARNING: suspicious RCU usage
5.9.1+ #72 Not tainted
-----------------------------
./include/linux/rcupdate.h:636 rcu_read_lock() used illegally while idle!

other info that might help us debug this:


rcu_scheduler_active = 1, debug_locks = 1
RCU used illegally from extended quiescent state!
4 locks held by swapper/0/0:
 #0: ffffffff97a801a0 (console_lock){+.+.}-{0:0}, at: vprintk_emit+0x126/0x410
 #1: ffffffff97a7fec0 (console_owner){....}-{0:0}, at: console_unlock+0x190/0x5d0
 #2: ffffffff97b7d598 (printing_lock){....}-{2:2}, at: vt_console_print+0x7d/0x3e0
 #3: ffffffff97a82d80 (rcu_read_lock){....}-{1:2}, at: __atomic_notifier_call_chain+0x5/0x110

stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.1+ #72
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 dump_stack+0x8d/0xbb
 __atomic_notifier_call_chain+0xd7/0x110
 vt_console_print+0x19e/0x3e0
 console_unlock+0x417/0x5d0
 vprintk_emit+0x133/0x410
 printk+0x52/0x6e
 lockdep_rcu_suspicious+0x1b/0xe0
 ? default_idle+0xa/0x20
 trace_hardirqs_on+0xed/0xf0
 default_idle+0xa/0x20
 default_idle_call+0x4f/0x1e0
 do_idle+0x1ef/0x2c0
 cpu_startup_entry+0x19/0x20
 start_kernel+0x578/0x59d
 secondary_startup_64+0xa4/0xb0

=============================
WARNING: suspicious RCU usage
5.9.1+ #72 Not tainted
-----------------------------
./include/linux/rcupdate.h:685 rcu_read_unlock() used illegally while idle!

other info that might help us debug this:


rcu_scheduler_active = 1, debug_locks = 1
RCU used illegally from extended quiescent state!
4 locks held by swapper/0/0:
 #0: ffffffff97a801a0 (console_lock){+.+.}-{0:0}, at: vprintk_emit+0x126/0x410
 #1: ffffffff97a7fec0 (console_owner){....}-{0:0}, at: console_unlock+0x190/0x5d0
 #2: ffffffff97b7d598 (printing_lock){....}-{2:2}, at: vt_console_print+0x7d/0x3e0
 #3: ffffffff97a82d80 (rcu_read_lock){....}-{1:2}, at: __atomic_notifier_call_chain+0x5/0x110

stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.1+ #72
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 dump_stack+0x8d/0xbb
 __atomic_notifier_call_chain+0x101/0x110
 vt_console_print+0x19e/0x3e0
 console_unlock+0x417/0x5d0
 vprintk_emit+0x133/0x410
 printk+0x52/0x6e
 lockdep_rcu_suspicious+0x1b/0xe0
 ? default_idle+0xa/0x20
 trace_hardirqs_on+0xed/0xf0
 default_idle+0xa/0x20
 default_idle_call+0x4f/0x1e0
 do_idle+0x1ef/0x2c0
 cpu_startup_entry+0x19/0x20
 start_kernel+0x578/0x59d
 secondary_startup_64+0xa4/0xb0
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 dump_stack+0x8d/0xbb
 ? rcu_idle_exit+0x32/0x40
 trace_hardirqs_off+0xe3/0xf0
 rcu_idle_exit+0x32/0x40
 default_idle_call+0x54/0x1e0
 do_idle+0x1ef/0x2c0
 ? lockdep_hardirqs_on_prepare+0xec/0x180
 cpu_startup_entry+0x19/0x20
 start_secondary+0x11c/0x160
 secondary_startup_64+0xa4/0xb0

Thanks,

Mathieu


> 
> -- Steve
> 
> 
>> rcuidle API. That would effectively remove the calls to
>> rcu_irq_{enter,exit}_irqson
>> from the tracepoint code.
>> 
>> This is however beyond the scope of the proposed patch set.
>> 
>> Thanks,
>> 
>> Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

Attachment: 0001-wip-remove-rcuidle-tracepoint-API.patch
Description: application/mbox


[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux