On Wed, Jan 25, 2023 at 10:40:17AM +0100, Peter Zijlstra wrote: > On Wed, Jan 25, 2023 at 10:35:16AM +0100, Peter Zijlstra wrote: > > tip/sched/core contains the following patch addressing this: > > > > --- > > commit 9aedeaed6fc6fe8452b9b8225e95cc2b8631ff91 > > Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx> > > Date: Thu Jan 12 20:43:49 2023 +0100 > > > > tracing, hardirq: No moar _rcuidle() tracing > > > > Robot reported that trace_hardirqs_{on,off}() tickle the forbidden > > _rcuidle() tracepoint through local_irq_{en,dis}able(). > > > > For 'sane' configs, these calls will only happen with RCU enabled and > > as such can use the regular tracepoint. This also means it's possible > > to trace them from NMI context again. > > > > Reported-by: kernel test robot <lkp@xxxxxxxxx> > > Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx> > > Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx> > > Link: https://lore.kernel.org/r/20230112195541.477416709@xxxxxxxxxxxxx > > > > diff --git a/kernel/trace/trace_preemptirq.c b/kernel/trace/trace_preemptirq.c > > index 629f2854e12b..f992444a0b1f 100644 > > --- a/kernel/trace/trace_preemptirq.c > > +++ b/kernel/trace/trace_preemptirq.c > > @@ -19,6 +19,20 @@ > > /* Per-cpu variable to prevent redundant calls when IRQs already off */ > > static DEFINE_PER_CPU(int, tracing_irq_cpu); > > > > +/* > > + * Use regular trace points on architectures that implement noinstr > > + * tooling: these calls will only happen with RCU enabled, which can > > + * use a regular tracepoint. > > + * > > + * On older architectures, use the rcuidle tracing methods (which > > + * aren't NMI-safe - so exclude NMI contexts): > > + */ > > +#ifdef CONFIG_ARCH_WANTS_NO_INSTR > > +#define trace(point) trace_##point > > +#else > > +#define trace(point) if (!in_nmi()) trace_##point##_rcuidle > > +#endif > > + > > /* > > * Like trace_hardirqs_on() but without the lockdep invocation. This is > > * used in the low level entry code where the ordering vs. RCU is important > > For some reason I missed the trace_preempt_{on,off} things, so that then > gets the below on top or so. > > diff --git a/kernel/trace/trace_preemptirq.c b/kernel/trace/trace_preemptirq.c > index f992444a0b1f..ea96b41c8838 100644 > --- a/kernel/trace/trace_preemptirq.c > +++ b/kernel/trace/trace_preemptirq.c > @@ -100,15 +100,13 @@ NOKPROBE_SYMBOL(trace_hardirqs_off); > > void trace_preempt_on(unsigned long a0, unsigned long a1) > { > - if (!in_nmi()) > - trace_preempt_enable_rcuidle(a0, a1); > + trace(preempt_enable)(a0, a1); > tracer_preempt_on(a0, a1); > } > > void trace_preempt_off(unsigned long a0, unsigned long a1) > { > - if (!in_nmi()) > - trace_preempt_disable_rcuidle(a0, a1); > + trace(preempt_disable)(a0, a1); > tracer_preempt_off(a0, a1); > } > #endif I've tested this fixlet atop this series (itself atop tip/sched/core) with a full-fat ftrace config and the ftrace selftests, and that all runs cleanly. FWIW, if you spin this as a patch: Tested-by: Mark Rutland <mark.rutland@xxxxxxx> Without the fixlet I get splats on both arm64 and x86_64, e.g. On arm64: | ------------[ cut here ]------------ | WARNING: CPU: 1 PID: 1162 at include/trace/events/preemptirq.h:55 trace_preempt_on+0x68/0x70 | Modules linked in: | CPU: 1 PID: 1162 Comm: ftracetest Not tainted 6.2.0-rc1-00100-g1066815869f5 #1 | Hardware name: linux,dummy-virt (DT) | pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) | pc : trace_preempt_on+0x68/0x70 | lr : preempt_count_sub+0xb4/0xf0 | sp : ffff80000e04ba70 | x29: ffff80000e04ba70 x28: ffff80000ade09e8 x27: ffff800009d0c960 | x26: ffff000007c97e10 x25: ffff000007c97e00 x24: ffff80000adf1410 | x23: ffff80000e04bb80 x22: ffff80000a9aabb0 x21: ffff8000080a5cf4 | x20: ffff8000080a5cf4 x19: 0000000000000001 x18: 0000000000000000 | x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 | x14: 0000000000000028 x13: 0000000000004320 x12: 0000000000000361 | x11: 0000000000000361 x10: 0000000000040361 x9 : ffff8000080a5cf4 | x8 : ffff80000ae42a18 x7 : 0000000000000000 x6 : 0000000000000001 | x5 : ffff80000a8f14e8 x4 : 0000000000000001 x3 : 0000000000000000 | x2 : 0000000000000007 x1 : ffff8000080a5cf4 x0 : ffff8000080a5cf4 | Call trace: | trace_preempt_on+0x68/0x70 | preempt_count_sub+0xb4/0xf0 | percpu_up_read.constprop.0+0xc4/0x180 | cpus_read_unlock+0x18/0x24 | static_key_enable+0x2c/0x40 | tracepoint_add_func+0x330/0x3dc | tracepoint_probe_register+0x74/0xc0 | trace_event_reg+0x8c/0xa0 | __ftrace_event_enable_disable+0x174/0x4d0 | __ftrace_set_clr_event_nolock+0xe0/0x150 | ftrace_set_clr_event+0x90/0x13c | ftrace_event_write+0xd4/0x120 | vfs_write+0xcc/0x2f0 | ksys_write+0x78/0x110 | __arm64_sys_write+0x24/0x30 | invoke_syscall+0x50/0x120 | el0_svc_common.constprop.0+0x68/0x124 | do_el0_svc+0x40/0xbc | el0_svc+0x48/0xc0 | el0t_64_sync_handler+0xf4/0x120 | el0t_64_sync+0x190/0x194 | irq event stamp: 69662 | hardirqs last enabled at (69661): [<ffff8000092d63f4>] _raw_spin_unlock_irqrestore+0x80/0xa0 | hardirqs last disabled at (69662): [<ffff8000092c4d64>] el1_dbg+0x24/0x90 | softirqs last enabled at (69564): [<ffff800008010b08>] __do_softirq+0x448/0x5bc | softirqs last disabled at (69555): [<ffff800008017288>] ____do_softirq+0x18/0x24 | ---[ end trace 0000000000000000 ]--- | ------------[ cut here ]------------ | WARNING: CPU: 1 PID: 1162 at include/trace/events/preemptirq.h:51 trace_preempt_off+0x68/0xb0 | Modules linked in: | CPU: 1 PID: 1162 Comm: ftracetest Tainted: G W 6.2.0-rc1-00100-g1066815869f5 #1 | Hardware name: linux,dummy-virt (DT) | pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) | pc : trace_preempt_off+0x68/0xb0 | lr : preempt_count_add+0xa0/0xc0 | sp : ffff80000e04ba80 | x29: ffff80000e04ba80 x28: ffff80000ade09e8 x27: ffff800009d0c970 | x26: ffff000007c97e90 x25: ffff000007c97e80 x24: ffff80000adf13c8 | x23: ffff80000e04bb80 x22: ffff80000a9aabb0 x21: ffff80000b8c8d18 | x20: ffff8000080a5c68 x19: ffff8000080a5c68 x18: 0000000000000000 | x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 | x14: 0000000000000028 x13: 0000000000004320 x12: 0000000000000361 | x11: 0000000000000361 x10: 0000000000040361 x9 : ffff8000080a5c68 | x8 : ffff80000ae42a18 x7 : 0000000000000000 x6 : 0000000000000003 | x5 : 000000003a9e6115 x4 : 0000000000000001 x3 : 0000000000000000 | x2 : 0000000000000007 x1 : ffff8000080a5c68 x0 : ffff8000080a5c68 | Call trace: | trace_preempt_off+0x68/0xb0 | preempt_count_add+0xa0/0xc0 | percpu_up_read.constprop.0+0x38/0x180 | cpus_read_unlock+0x18/0x24 | static_key_enable+0x2c/0x40 | tracepoint_add_func+0x330/0x3dc | tracepoint_probe_register+0x74/0xc0 | trace_event_reg+0x8c/0xa0 | __ftrace_event_enable_disable+0x174/0x4d0 | __ftrace_set_clr_event_nolock+0xe0/0x150 | ftrace_set_clr_event+0x90/0x13c | ftrace_event_write+0xd4/0x120 | vfs_write+0xcc/0x2f0 | ksys_write+0x78/0x110 | __arm64_sys_write+0x24/0x30 | invoke_syscall+0x50/0x120 | el0_svc_common.constprop.0+0x68/0x124 | do_el0_svc+0x40/0xbc | el0_svc+0x48/0xc0 | el0t_64_sync_handler+0xf4/0x120 | el0t_64_sync+0x190/0x194 | irq event stamp: 69686 | hardirqs last enabled at (69685): [<ffff8000092c4db8>] el1_dbg+0x78/0x90 | hardirqs last disabled at (69686): [<ffff8000092c4d64>] el1_dbg+0x24/0x90 | softirqs last enabled at (69670): [<ffff800008010b08>] __do_softirq+0x448/0x5bc | softirqs last disabled at (69665): [<ffff800008017288>] ____do_softirq+0x18/0x24 | ---[ end trace 0000000000000000 ]--- On x86_64: | ------------[ cut here ]------------ | WARNING: CPU: 0 PID: 1083 at include/trace/events/preemptirq.h:55 trace_preempt_on+0x31/0x40 | Modules linked in: | CPU: 0 PID: 1083 Comm: ftracetest Not tainted 6.2.0-rc1-00100-g1066815869f5 #3 | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 | RIP: 0010:trace_preempt_on+0x31/0x40 | Code: 1d d6 45 4c a9 00 00 f0 00 74 05 e9 29 11 00 00 cc 90 e9 22 11 00 00 65 8b 05 07 d6 45 4c 89 c0 48 0f a3 05 c1 af b1 01 73 e1 <0f> 0b eb dd 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 0 | RSP: 0018:ffffb374c0ce7ba8 EFLAGS: 00010247 | RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 | RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffffffb49e9c1d | RBP: 0000000000000000 R08: 000000004ea10906 R09: 000000005e02388f | R10: 00000000f5e02388 R11: 00000000d269a5f5 R12: ffff98620005c780 | R13: ffffffffb3a363d0 R14: 0000000000000a37 R15: 0000000000000a38 | FS: 00007fc812d836a0(0000) GS:ffff98623bc00000(0000) knlGS:0000000000000000 | CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 | CR2: 0000000000642e58 CR3: 00000001039c6005 CR4: 0000000000370ef0 | DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 | DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 | Call Trace: | <TASK> | preempt_count_sub+0xa3/0xe0 | _raw_spin_unlock+0x2d/0x50 | ? trace_preempt_on+0x17/0x40 | __text_poke+0x326/0x4c0 | ? trace_preempt_on+0x17/0x40 | text_poke_bp_batch+0x7f/0x360 | text_poke_finish+0x1f/0x30 | arch_jump_label_transform_apply+0x1c/0x30 | static_key_enable_cpuslocked+0x65/0xa0 | static_key_enable+0x1a/0x20 | tracepoint_add_func+0x32c/0x430 | ? __pfx_trace_event_raw_event_preemptirq_template+0x10/0x10 | tracepoint_probe_register+0x78/0xb0 | ? __pfx_trace_event_raw_event_preemptirq_template+0x10/0x10 | __ftrace_event_enable_disable+0x180/0x250 | __ftrace_set_clr_event_nolock+0xe3/0x130 | ftrace_set_clr_event+0x74/0xf0 | ftrace_event_write+0xdd/0x110 | vfs_write+0xee/0x510 | ksys_write+0x75/0x100 | do_syscall_64+0x3e/0x90 | entry_SYSCALL_64_after_hwframe+0x72/0xdc | RIP: 0033:0x7fc812d09103 | Code: 8b 7c 24 08 89 c5 e8 c5 ff ff ff 89 ef 89 44 24 08 e8 81 bc 02 00 8b 44 24 08 48 83 c4 10 5d c3 48 63 ff b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 76 10 48 8b 15 4e fd 05 00 f7 d8 64 3 | RSP: 002b:00007ffcb12972f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 | RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fc812d09103 | RDX: 0000000000000004 RSI: 0000000000645050 RDI: 0000000000000001 | RBP: 0000000000645050 R08: fefefefefefefeff R09: ffffffff00000000 | R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000004 | R13: 00007fc812d83690 R14: 0000000000000001 R15: 0000000000000000 | </TASK> | irq event stamp: 57597 | hardirqs last enabled at (57609): [<ffffffffb3b1384e>] __up_console_sem+0x5e/0x80 | hardirqs last disabled at (57620): [<ffffffffb3b13833>] __up_console_sem+0x43/0x80 | softirqs last enabled at (57556): [<ffffffffb49eae94>] __do_softirq+0x354/0x4d7 | softirqs last disabled at (57543): [<ffffffffb3a92ad7>] irq_exit_rcu+0xc7/0x140 | ---[ end trace 0000000000000000 ]--- | ------------[ cut here ]------------ | WARNING: CPU: 0 PID: 1083 at include/trace/events/preemptirq.h:51 trace_preempt_off+0x31/0x40 | Modules linked in: | CPU: 0 PID: 1083 Comm: ftracetest Tainted: G W 6.2.0-rc1-00100-g1066815869f5 #3 | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 | RIP: 0010:trace_preempt_off+0x31/0x40 | Code: cd d5 45 4c a9 00 00 f0 00 74 05 e9 f9 11 00 00 cc 90 e9 f2 11 00 00 65 8b 05 b7 d5 45 4c 89 c0 48 0f a3 05 71 af b1 01 73 e1 <0f> 0b eb dd 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 0 | RSP: 0018:ffffb374c0ce7c18 EFLAGS: 00010247 | RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000001 | RDX: 0000000000000000 RSI: ffffffffb3b74639 RDI: ffffffffb3b74639 | RBP: ffffffffb56faa10 R08: ffffffffb56faa10 R09: 000000005e02388f | R10: 00000000f5e02388 R11: 00000000d269a5f5 R12: ffffffffb3a36d20 | R13: 0000000000000000 R14: 0000000000000000 R15: ffff9862038a0780 | FS: 00007fc812d836a0(0000) GS:ffff98623bc00000(0000) knlGS:0000000000000000 | CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 | CR2: 0000000000642e58 CR3: 00000001039c6005 CR4: 0000000000370ef0 | DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 | DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 | Call Trace: | <TASK> | on_each_cpu_cond_mask+0x29/0x80 | text_poke_bp_batch+0xe7/0x360 | text_poke_finish+0x1f/0x30 | arch_jump_label_transform_apply+0x1c/0x30 | static_key_enable_cpuslocked+0x65/0xa0 | static_key_enable+0x1a/0x20 | tracepoint_add_func+0x32c/0x430 | ? __pfx_trace_event_raw_event_preemptirq_template+0x10/0x10 | tracepoint_probe_register+0x78/0xb0 | ? __pfx_trace_event_raw_event_preemptirq_template+0x10/0x10 | __ftrace_event_enable_disable+0x180/0x250 | __ftrace_set_clr_event_nolock+0xe3/0x130 | ftrace_set_clr_event+0x74/0xf0 | ftrace_event_write+0xdd/0x110 | vfs_write+0xee/0x510 | ksys_write+0x75/0x100 | do_syscall_64+0x3e/0x90 | entry_SYSCALL_64_after_hwframe+0x72/0xdc | RIP: 0033:0x7fc812d09103 | Code: 8b 7c 24 08 89 c5 e8 c5 ff ff ff 89 ef 89 44 24 08 e8 81 bc 02 00 8b 44 24 08 48 83 c4 10 5d c3 48 63 ff b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 76 10 48 8b 15 4e fd 05 00 f7 d8 64 3 | RSP: 002b:00007ffcb12972f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 | RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fc812d09103 | RDX: 0000000000000004 RSI: 0000000000645050 RDI: 0000000000000001 | RBP: 0000000000645050 R08: fefefefefefefeff R09: ffffffff00000000 | R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000004 | R13: 00007fc812d83690 R14: 0000000000000001 R15: 0000000000000000 | </TASK> | irq event stamp: 58293 | hardirqs last enabled at (58303): [<ffffffffb3b1384e>] __up_console_sem+0x5e/0x80 | hardirqs last disabled at (58314): [<ffffffffb3b13833>] __up_console_sem+0x43/0x80 | softirqs last enabled at (57820): [<ffffffffb49eae94>] __do_softirq+0x354/0x4d7 | softirqs last disabled at (57811): [<ffffffffb3a92ad7>] irq_exit_rcu+0xc7/0x140 | ---[ end trace 0000000000000000 ]--- Thanks, Mark.