On Fri, Nov 11, 2022 at 22:45, Jiri Olsa <olsajiri@xxxxxxxxx> wrote:
>
> On Thu, Nov 10, 2022 at 12:53:16AM +0100, Jiri Olsa wrote:
>
> SNIP
>
> > > > > > ---
> > > > > > diff --git a/include/trace/bpf_probe.h b/include/trace/bpf_probe.h
> > > > > > index 6a13220d2d27..5a354ae096e5 100644
> > > > > > --- a/include/trace/bpf_probe.h
> > > > > > +++ b/include/trace/bpf_probe.h
> > > > > > @@ -78,11 +78,15 @@
> > > > > >  #define CAST_TO_U64(...) CONCATENATE(__CAST, COUNT_ARGS(__VA_ARGS__))(__VA_ARGS__)
> > > > > >
> > > > > >  #define __BPF_DECLARE_TRACE(call, proto, args) \
> > > > > > +static DEFINE_PER_CPU(int, __bpf_trace_tp_active_##call); \
> > > > > >  static notrace void \
> > > > > >  __bpf_trace_##call(void *__data, proto) \
> > > > > >  { \
> > > > > >  	struct bpf_prog *prog = __data; \
> > > > > > -	CONCATENATE(bpf_trace_run, COUNT_ARGS(args))(prog, CAST_TO_U64(args)); \
> > > > > > +	\
> > > > > > +	if (likely(this_cpu_inc_return(__bpf_trace_tp_active_##call) == 1)) \
> > > > > > +		CONCATENATE(bpf_trace_run, COUNT_ARGS(args))(prog, CAST_TO_U64(args)); \
> > > > > > +	this_cpu_dec(__bpf_trace_tp_active_##call); \
> > > > > >  }
> > > > >
> > > > > This approach will hurt real use cases where
> > > > > multiple and different raw_tp progs run on the same cpu.
> > > >
> > > > would the 2 levels of nesting help in here?
> > > >
> > > > I can imagine the change above would break use case where we want to
> > > > trigger tracepoints in irq context that interrupted task that's already
> > > > in the same tracepoint
> > > >
> > > > with 2 levels of nesting we would trigger that tracepoint from irq and
> > > > would still be safe with bpf_bprintf_prepare buffer
> > >
> > > How would these 2 levels work?
> >
> > just using the active counter like below, but I haven't tested it yet
> >
> > jirka
>
> seems to be working
>
> Hao Sun, could you please test this patch?
>

Hi Jirka,

I've tested the proposed patch: the warning from bpf_bprintf_prepare is
no longer triggered, but the reproducer can still trigger the following
warning.
My test was conducted on:

commit:        f67dd6ce0723 Merge tag 'slab-for-6.1-rc4-fixes'
git tree:      upstream
kernel config: https://pastebin.com/raw/sE5QK5HL
C reproducer:  https://pastebin.com/raw/X96ASi27
console log *before* the patch: https://pastebin.com/raw/eSCUNFrd
console log *after* the patch:  https://pastebin.com/raw/tzcmdWZt

============================================
WARNING: possible recursive locking detected
6.1.0-rc4-00020-gf67dd6ce0723-dirty #11 Not tainted
--------------------------------------------
a.out/6703 is trying to acquire lock:
ffffffff8ce0ea18 (trace_printk_lock){....}-{2:2}, at: ____bpf_trace_printk kernel/trace/bpf_trace.c:390 [inline]
ffffffff8ce0ea18 (trace_printk_lock){....}-{2:2}, at: bpf_trace_printk+0xcf/0x170 kernel/trace/bpf_trace.c:376

but task is already holding lock:
ffffffff8ce0ea18 (trace_printk_lock){....}-{2:2}, at: ____bpf_trace_printk kernel/trace/bpf_trace.c:390 [inline]
ffffffff8ce0ea18 (trace_printk_lock){....}-{2:2}, at: bpf_trace_printk+0xcf/0x170 kernel/trace/bpf_trace.c:376

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(trace_printk_lock);
  lock(trace_printk_lock);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

4 locks held by a.out/6703:
 #0: ffffffff8ce02fc8 (event_mutex){+.+.}-{3:3}, at: __ftrace_set_clr_event kernel/trace/trace_events.c:1060 [inline]
 #0: ffffffff8ce02fc8 (event_mutex){+.+.}-{3:3}, at: trace_set_clr_event+0xd5/0x140 kernel/trace/trace_events.c:1126
 #1: ffffffff8cd85f00 (rcu_read_lock){....}-{1:2}, at: __bpf_trace_run kernel/trace/bpf_trace.c:2249 [inline]
 #1: ffffffff8cd85f00 (rcu_read_lock){....}-{1:2}, at: bpf_trace_run2+0xb9/0x3d0 kernel/trace/bpf_trace.c:2293
 #2: ffffffff8ce0ea18 (trace_printk_lock){....}-{2:2}, at: ____bpf_trace_printk kernel/trace/bpf_trace.c:390 [inline]
 #2: ffffffff8ce0ea18 (trace_printk_lock){....}-{2:2}, at: bpf_trace_printk+0xcf/0x170 kernel/trace/bpf_trace.c:376
 #3: ffffffff8cd85f00 (rcu_read_lock){....}-{1:2}, at: __bpf_trace_run kernel/trace/bpf_trace.c:2249 [inline]
 #3: ffffffff8cd85f00 (rcu_read_lock){....}-{1:2}, at: bpf_trace_run2+0xb9/0x3d0 kernel/trace/bpf_trace.c:2293

stack backtrace:
CPU: 7 PID: 6703 Comm: a.out Not tainted 6.1.0-rc4-00020-gf67dd6ce0723-dirty #11
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.0-3-3 04/01/2014
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xfc/0x174 lib/dump_stack.c:106
 print_deadlock_bug kernel/locking/lockdep.c:2990 [inline]
 check_deadlock kernel/locking/lockdep.c:3033 [inline]
 validate_chain kernel/locking/lockdep.c:3818 [inline]
 __lock_acquire.cold+0x119/0x3b9 kernel/locking/lockdep.c:5055
 lock_acquire kernel/locking/lockdep.c:5668 [inline]
 lock_acquire+0x1dc/0x600 kernel/locking/lockdep.c:5633
 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
 _raw_spin_lock_irqsave+0x36/0x50 kernel/locking/spinlock.c:162
 ____bpf_trace_printk kernel/trace/bpf_trace.c:390 [inline]
 bpf_trace_printk+0xcf/0x170 kernel/trace/bpf_trace.c:376
 ___bpf_prog_run+0x42e5/0x8de0 kernel/bpf/core.c:1818
 __bpf_prog_run32+0x99/0xe0 kernel/bpf/core.c:2041
 bpf_dispatcher_nop_func include/linux/bpf.h:964 [inline]
 __bpf_prog_run include/linux/filter.h:600 [inline]
 bpf_prog_run include/linux/filter.h:607 [inline]
 __bpf_trace_run kernel/trace/bpf_trace.c:2254 [inline]
 bpf_trace_run2+0x14d/0x3d0 kernel/trace/bpf_trace.c:2293
 __bpf_trace_contention_begin+0xc8/0x120 include/trace/events/lock.h:95
 __traceiter_contention_begin+0x56/0x90 include/trace/events/lock.h:95
 trace_contention_begin.constprop.0+0x143/0x240 include/trace/events/lock.h:95
 __pv_queued_spin_lock_slowpath+0xfd/0xc70 kernel/locking/qspinlock.c:405
 pv_queued_spin_lock_slowpath arch/x86/include/asm/paravirt.h:591 [inline]
 queued_spin_lock_slowpath arch/x86/include/asm/qspinlock.h:51 [inline]
 queued_spin_lock include/asm-generic/qspinlock.h:114 [inline]
 do_raw_spin_lock+0x20a/0x2b0 kernel/locking/spinlock_debug.c:115
 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:111 [inline]
 _raw_spin_lock_irqsave+0x3e/0x50 kernel/locking/spinlock.c:162
 ____bpf_trace_printk kernel/trace/bpf_trace.c:390 [inline]
 bpf_trace_printk+0xcf/0x170 kernel/trace/bpf_trace.c:376
 ___bpf_prog_run+0x42e5/0x8de0 kernel/bpf/core.c:1818
 __bpf_prog_run32+0x99/0xe0 kernel/bpf/core.c:2041
 bpf_dispatcher_nop_func include/linux/bpf.h:964 [inline]
 __bpf_prog_run include/linux/filter.h:600 [inline]
 bpf_prog_run include/linux/filter.h:607 [inline]
 __bpf_trace_run kernel/trace/bpf_trace.c:2254 [inline]
 bpf_trace_run2+0x14d/0x3d0 kernel/trace/bpf_trace.c:2293
 __bpf_trace_contention_begin+0xc8/0x120 include/trace/events/lock.h:95
 __traceiter_contention_begin+0x56/0x90 include/trace/events/lock.h:95
 trace_contention_begin+0x129/0x1e0 include/trace/events/lock.h:95
 __mutex_lock_common kernel/locking/mutex.c:605 [inline]
 __mutex_lock+0x15a/0x12d0 kernel/locking/mutex.c:747
 __ftrace_set_clr_event kernel/trace/trace_events.c:1060 [inline]
 trace_set_clr_event+0xd5/0x140 kernel/trace/trace_events.c:1126
 __set_printk_clr_event kernel/trace/bpf_trace.c:419 [inline]
 bpf_get_trace_printk_proto kernel/trace/bpf_trace.c:425 [inline]
 bpf_tracing_func_proto+0x476/0x5a0 kernel/trace/bpf_trace.c:1422
 raw_tp_prog_func_proto+0x4f/0x60 kernel/trace/bpf_trace.c:1885
 check_helper_call+0x20a/0x94d0 kernel/bpf/verifier.c:7255
 do_check kernel/bpf/verifier.c:12384 [inline]
 do_check_common+0x6b3e/0xdf20 kernel/bpf/verifier.c:14643
 do_check_main kernel/bpf/verifier.c:14706 [inline]
 bpf_check+0x714c/0xa9b0 kernel/bpf/verifier.c:15276
 bpf_prog_load+0xfb1/0x2110 kernel/bpf/syscall.c:2605
 __sys_bpf+0xaba/0x5520 kernel/bpf/syscall.c:4965
 __do_sys_bpf kernel/bpf/syscall.c:5069 [inline]
 __se_sys_bpf kernel/bpf/syscall.c:5067 [inline]
 __x64_sys_bpf+0x74/0xb0 kernel/bpf/syscall.c:5067
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x34/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fe9f4ee4469
Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ff 49 2b 00 f7 d8 64 89 01 48
RSP: 002b:00007fff7c845438 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fe9f4ee4469
RDX: 0000000000000048 RSI: 0000000020000080 RDI: 0000000000000005
RBP: 00007fff7c845450 R08: 00007fe9f4f2e160 R09: 00007fff7c845450
R10: 0000000020001cc2 R11: 0000000000000246 R12: 000055643c800c00
R13: 00007fff7c845570 R14: 0000000000000000 R15: 0000000000000000
 </TASK>

Regards
Hao

> thanks,
> jirka
>
> >
> > ---
> > diff --git a/include/trace/bpf_probe.h b/include/trace/bpf_probe.h
> > index 6a13220d2d27..ca5dd34478b7 100644
> > --- a/include/trace/bpf_probe.h
> > +++ b/include/trace/bpf_probe.h
> > @@ -78,11 +78,15 @@
> >  #define CAST_TO_U64(...) CONCATENATE(__CAST, COUNT_ARGS(__VA_ARGS__))(__VA_ARGS__)
> >
> >  #define __BPF_DECLARE_TRACE(call, proto, args) \
> > +static DEFINE_PER_CPU(int, __bpf_trace_tp_active_##call); \
> >  static notrace void \
> >  __bpf_trace_##call(void *__data, proto) \
> >  { \
> >  	struct bpf_prog *prog = __data; \
> > -	CONCATENATE(bpf_trace_run, COUNT_ARGS(args))(prog, CAST_TO_U64(args)); \
> > +	\
> > +	if (likely(this_cpu_inc_return(__bpf_trace_tp_active_##call) < 3)) \
> > +		CONCATENATE(bpf_trace_run, COUNT_ARGS(args))(prog, CAST_TO_U64(args)); \
> > +	this_cpu_dec(__bpf_trace_tp_active_##call); \
> >  }
> >
> >  #undef DECLARE_EVENT_CLASS
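
For readers following the thread, here is a minimal user-space sketch of
the nesting-counter logic from the quoted patch. It is not part of the
patch: the names tp_active, depth, demo_prog() and demo_tracepoint() are
invented for illustration only; in the kernel the counter is the per-CPU
variable __bpf_trace_tp_active_##call, the increment is
this_cpu_inc_return(), and the guarded call is bpf_trace_run*().

#include <stdio.h>

static int tp_active;	/* stands in for per-CPU __bpf_trace_tp_active_##call */
static int depth;	/* nesting depth, tracked only for the demo output */

static void demo_tracepoint(void);

/* Stands in for the attached BPF program: it re-enters the same
 * tracepoint, the way a prog calling bpf_trace_printk() can when
 * taking trace_printk_lock fires contention_begin again. */
static void demo_prog(void)
{
	depth++;
	demo_tracepoint();
	depth--;
}

/* Mirrors the expanded __bpf_trace_##call() body with the "< 3" check:
 * the first two nesting levels (task context plus one interrupt) still
 * run the program; anything deeper is dropped. */
static void demo_tracepoint(void)
{
	if (++tp_active < 3)	/* models this_cpu_inc_return() returning 1 or 2 */
		demo_prog();
	else
		printf("depth %d: nesting limit hit, prog skipped\n", depth);
	tp_active--;		/* models this_cpu_dec() */
}

int main(void)
{
	demo_tracepoint();	/* prints once, at the third nesting level */
	return 0;
}

This also illustrates why the lockdep report above still fires: the
second level is deliberately allowed to run, so a program that holds
trace_printk_lock at level 1 and re-enters through contention_begin
tries to take the same lock again at level 2. The counter bounds the
recursion depth but does not make the helper's lock re-entrant.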