On Wed, Nov 02, 2022 at 03:28:47PM +0100, Jiri Olsa wrote:
> On Thu, Oct 27, 2022 at 07:45:16PM +0800, Hao Sun wrote:
> > Jiri Olsa <olsajiri@xxxxxxxxx> wrote on Thu, Oct 27, 2022 at 19:24:
> > >
> > > On Thu, Oct 27, 2022 at 10:27:28AM +0800, Hao Sun wrote:
> > > > Hi,
> > > >
> > > > The following warning can be triggered with the C reproducer in the link.
> > > > Syzbot also reported this several days ago, Jiri posted a patch that
> > > > uses bpf prog `active` field to fix this by 05b24ff9b2cfab (bpf:
> > > > Prevent bpf program recursion...) according to syzbot dashboard
> > > > (https://syzkaller.appspot.com/bug?id=179313fb375161d50a98311a28b8e2fc5f7350f9).
> > > > But this warning can still be triggered on 247f34f7b803
> > > > (Linux-v6.1-rc2) that already merged the patch, so it seems that this
> > > > still is an issue.
> > > >
> > > > HEAD commit: 247f34f7b803 Linux 6.1-rc2
> > > > git tree: upstream
> > > > console output: https://pastebin.com/raw/kNw8JCu5
> > > > kernel config: https://pastebin.com/raw/sE5QK5HL
> > > > C reproducer: https://pastebin.com/raw/X96ASi27
> > >
> > > hi,
> > > right, that fix addressed that issue for single bpf program,
> > > and it won't prevent if there are multiple programs hook on
> > > contention_begin tracepoint and calling bpf_trace_printk,
> > >
> > > I'm not sure we can do something there.. will check
> > >
> > > do you run just the reproducer, or you load the server somehow?
> > > I cannot hit the issue so far
> > >
> >
> > Hi,
> >
> > Last email has format issues, resend it here.
> >
> > I built the kernel with the config in the link, which contains
> > “CONFIG_CMDLINE="earlyprintk=serial net.ifnames=0
> > sysctl.kernel.hung_task_all_cpu_backtrace=1 panic_on_warn=1 …”, and
> > boot the kernel with normal qemu setup and then the warning can be
> > triggered by executing the reproducer.
> >
> > Also, I’m willing to test the proposed patch if any.
>
> fyi I reproduced that..
> will check if we can do anything about that

I reproduced this with set of 8 programs all hooked to contention_begin
tracepoint and here's what I think is happening:

all programs (prog1 .. prog8) call just bpf_trace_printk helper and I'm
running 'perf bench sched messaging' to load the machine

at some point some contended lock triggers trace_contention_begin:

  trace_contention_begin
    __traceiter_contention_begin  <-- iterates all functions attached to tracepoint
      __bpf_trace_run(prog1)
        prog1->active = 1
        bpf_prog_run(prog1)
          bpf_trace_printk
            bpf_bprintf_prepare  <-- takes buffer 1 out of 3
            raw_spin_lock_irqsave(trace_printk_lock)

              # we have global single trace_printk_lock, so we will trigger
              # its trace_contention_begin at some point

              trace_contention_begin
                __traceiter_contention_begin
                  __bpf_trace_run(prog1)
                    prog1->active block  <-- prog1 is already 'running', skipping the execution
                  __bpf_trace_run(prog2)
                    prog2->active = 1
                    bpf_prog_run(prog2)
                      bpf_trace_printk
                        bpf_bprintf_prepare  <-- takes buffer 2 out of 3
                        raw_spin_lock_irqsave(trace_printk_lock)
                          trace_contention_begin
                            __traceiter_contention_begin
                              __bpf_trace_run(prog1)
                                prog1->active block  <-- prog1 is already 'running', skipping the execution
                              __bpf_trace_run(prog2)
                                prog2->active block  <-- prog2 is already 'running', skipping the execution
                              __bpf_trace_run(prog3)
                                prog3->active = 1
                                bpf_prog_run(prog3)
                                  bpf_trace_printk
                                    bpf_bprintf_prepare  <-- takes buffer 3 out of 3
                                    raw_spin_lock_irqsave(trace_printk_lock)
                                      trace_contention_begin
                                        __traceiter_contention_begin
                                          __bpf_trace_run(prog1)
                                            prog1->active block  <-- prog1 is already 'running', skipping the execution
                                          __bpf_trace_run(prog2)
                                            prog2->active block  <-- prog2 is already 'running', skipping the execution
                                          __bpf_trace_run(prog3)
                                            prog3->active block  <-- prog3 is already 'running', skipping the execution
                                          __bpf_trace_run(prog4)
                                            prog4->active = 1
                                            bpf_prog_run(prog4)
                                              bpf_trace_printk
                                                bpf_bprintf_prepare  <-- tries to take buffer 4 out of 3 -> WARNING

the code path may vary based on the
contention of the trace_printk_lock, so I saw different nesting within 8 programs,
but all eventually ended up at 4 levels of nesting and hit the warning

I think we could perhaps move the 'active' flag protection from program
to the tracepoint level (in the patch below), to prevent nesting execution
of the same tracepoint, so it'd look like:

  trace_contention_begin
    __traceiter_contention_begin
      __bpf_trace_run(prog1) {
        contention_begin.active = 1
        bpf_prog_run(prog1)
          bpf_trace_printk
            bpf_bprintf_prepare
            raw_spin_lock_irqsave(trace_printk_lock)
              trace_contention_begin
                __traceiter_contention_begin
                  __bpf_trace_run(prog1)
                    blocked because contention_begin.active == 1
                  __bpf_trace_run(prog2)
                    blocked because contention_begin.active == 1
                  __bpf_trace_run(prog3)
                  ...
                  __bpf_trace_run(prog8)
                    blocked because contention_begin.active == 1

            raw_spin_unlock_irqrestore
            bpf_bprintf_cleanup

        contention_begin.active = 0
      }

      __bpf_trace_run(prog2) {
        contention_begin.active = 1
        bpf_prog_run(prog2)
          ...
        contention_begin.active = 0
      }

do we need bpf program execution in nested tracepoints?
we could actually allow 3 nesting levels for this case.. thoughts?

thanks,
jirka


---
diff --git a/include/trace/bpf_probe.h b/include/trace/bpf_probe.h
index 6a13220d2d27..5a354ae096e5 100644
--- a/include/trace/bpf_probe.h
+++ b/include/trace/bpf_probe.h
@@ -78,11 +78,15 @@
 #define CAST_TO_U64(...) CONCATENATE(__CAST, COUNT_ARGS(__VA_ARGS__))(__VA_ARGS__)
 
 #define __BPF_DECLARE_TRACE(call, proto, args)				\
+static DEFINE_PER_CPU(int, __bpf_trace_tp_active_##call);		\
 static notrace void							\
 __bpf_trace_##call(void *__data, proto)					\
 {									\
 	struct bpf_prog *prog = __data;					\
-	CONCATENATE(bpf_trace_run, COUNT_ARGS(args))(prog, CAST_TO_U64(args));	\
+									\
+	if (likely(this_cpu_inc_return(__bpf_trace_tp_active_##call) == 1))	\
+		CONCATENATE(bpf_trace_run, COUNT_ARGS(args))(prog, CAST_TO_U64(args));	\
+	this_cpu_dec(__bpf_trace_tp_active_##call);			\
 }
 
 #undef DECLARE_EVENT_CLASS