On Mon, Nov 7, 2022 at 4:31 AM Jiri Olsa <olsajiri@xxxxxxxxx> wrote: > > On Wed, Nov 02, 2022 at 03:28:47PM +0100, Jiri Olsa wrote: > > On Thu, Oct 27, 2022 at 07:45:16PM +0800, Hao Sun wrote: > > > Jiri Olsa <olsajiri@xxxxxxxxx> 于2022年10月27日周四 19:24写道: > > > > > > > > On Thu, Oct 27, 2022 at 10:27:28AM +0800, Hao Sun wrote: > > > > > Hi, > > > > > > > > > > The following warning can be triggered with the C reproducer in the link. > > > > > Syzbot also reported this several days ago, Jiri posted a patch that > > > > > uses bpf prog `active` field to fix this by 05b24ff9b2cfab (bpf: > > > > > Prevent bpf program recursion...) according to syzbot dashboard > > > > > (https://syzkaller.appspot.com/bug?id=179313fb375161d50a98311a28b8e2fc5f7350f9). > > > > > But this warning can still be triggered on 247f34f7b803 > > > > > (Linux-v6.1-rc2) that already merged the patch, so it seems that this > > > > > still is an issue. > > > > > > > > > > HEAD commit: 247f34f7b803 Linux 6.1-rc2 > > > > > git tree: upstream > > > > > console output: https://pastebin.com/raw/kNw8JCu5 > > > > > kernel config: https://pastebin.com/raw/sE5QK5HL > > > > > C reproducer: https://pastebin.com/raw/X96ASi27 > > > > > > > > hi, > > > > right, that fix addressed that issue for single bpf program, > > > > and it won't prevent if there are multiple programs hook on > > > > contention_begin tracepoint and calling bpf_trace_printk, > > > > > > > > I'm not sure we can do something there.. will check > > > > > > > > do you run just the reproducer, or you load the server somehow? > > > > I cannot hit the issue so far > > > > > > > > > > Hi, > > > > > > Last email has format issues, resend it here. > > > > > > I built the kernel with the config in the link, which contains > > > “CONFIG_CMDLINE="earlyprintk=serial net.ifnames=0 > > > sysctl.kernel.hung_task_all_cpu_backtrace=1 panic_on_warn=1 …”, and > > > boot the kernel with normal qemu setup and then the warning can be > > > triggered by executing the reproducer. > > > > > > Also, I’m willing to test the proposed patch if any. > > > > fyi I reproduced that.. will check if we can do anything about that > > I reproduced this with set of 8 programs all hooked to contention_begin > tracepoint and here's what I think is happening: > > all programs (prog1 .. prog8) call just bpf_trace_printk helper and I'm > running 'perf bench sched messaging' to load the machine > > at some point some contended lock triggers trace_contention_begin: > > trace_contention_begin > __traceiter_contention_begin <-- iterates all functions attached to tracepoint > __bpf_trace_run(prog1) > prog1->active = 1 > bpf_prog_run(prog1) > bpf_trace_printk > bpf_bprintf_prepare <-- takes buffer 1 out of 3 > raw_spin_lock_irqsave(trace_printk_lock) > > # we have global single trace_printk_lock, so we will trigger > # its trace_contention_begin at some point > > trace_contention_begin > __traceiter_contention_begin > __bpf_trace_run(prog1) > prog1->active block <-- prog1 is already 'running', skipping the execution > __bpf_trace_run(prog2) > prog2->active = 1 > bpf_prog_run(prog2) > bpf_trace_printk > bpf_bprintf_prepare <-- takes buffer 2 out of 3 > raw_spin_lock_irqsave(trace_printk_lock) > trace_contention_begin > __traceiter_contention_begin > __bpf_trace_run(prog1) > prog1->active block <-- prog1 is already 'running', skipping the execution > __bpf_trace_run(prog2) > prog2->active block <-- prog2 is already 'running', skipping the execution > __bpf_trace_run(prog3) > prog3->active = 1 > bpf_prog_run(prog3) > bpf_trace_printk > bpf_bprintf_prepare <-- takes buffer 3 out of 3 > raw_spin_lock_irqsave(trace_printk_lock) > trace_contention_begin > __traceiter_contention_begin > __bpf_trace_run(prog1) > prog1->active block <-- prog1 is already 'running', skipping the execution > __bpf_trace_run(prog2) > prog2->active block <-- prog2 is already 'running', skipping the execution > __bpf_trace_run(prog3) > prog3->active block <-- prog3 is already 'running', skipping the execution > __bpf_trace_run(prog4) > prog4->active = 1 > bpf_prog_run(prog4) > bpf_trace_printk > bpf_bprintf_prepare <-- tries to take buffer 4 out of 3 -> WARNING > > > the code path may vary based on the contention of the trace_printk_lock, > so I saw different nesting within 8 programs, but all eventually ended up > at 4 levels of nesting and hit the warning > > I think we could perhaps move the 'active' flag protection from program > to the tracepoint level (in the patch below), to prevent nesting execution > of the same tracepoint, so it'd look like: > > trace_contention_begin > __traceiter_contention_begin > __bpf_trace_run(prog1) { > contention_begin.active = 1 > bpf_prog_run(prog1) > bpf_trace_printk > bpf_bprintf_prepare > raw_spin_lock_irqsave(trace_printk_lock) > trace_contention_begin > __traceiter_contention_begin > __bpf_trace_run(prog1) > blocked because contention_begin.active == 1 > __bpf_trace_run(prog2) > blocked because contention_begin.active == 1 > __bpf_trace_run(prog3) > ... > __bpf_trace_run(prog8) > blocked because contention_begin.active == 1 > > raw_spin_unlock_irqrestore > bpf_bprintf_cleanup > > contention_begin.active = 0 > } > > __bpf_trace_run(prog2) { > contention_begin.active = 1 > bpf_prog_run(prog2) > ... > contention_begin.active = 0 > } > > do we need bpf program execution in nested tracepoints? > we could actually allow 3 nesting levels for this case.. thoughts? > > thanks, > jirka > > > --- > diff --git a/include/trace/bpf_probe.h b/include/trace/bpf_probe.h > index 6a13220d2d27..5a354ae096e5 100644 > --- a/include/trace/bpf_probe.h > +++ b/include/trace/bpf_probe.h > @@ -78,11 +78,15 @@ > #define CAST_TO_U64(...) CONCATENATE(__CAST, COUNT_ARGS(__VA_ARGS__))(__VA_ARGS__) > > #define __BPF_DECLARE_TRACE(call, proto, args) \ > +static DEFINE_PER_CPU(int, __bpf_trace_tp_active_##call); \ > static notrace void \ > __bpf_trace_##call(void *__data, proto) \ > { \ > struct bpf_prog *prog = __data; \ > - CONCATENATE(bpf_trace_run, COUNT_ARGS(args))(prog, CAST_TO_U64(args)); \ > + \ > + if (likely(this_cpu_inc_return(__bpf_trace_tp_active_##call) == 1)) \ > + CONCATENATE(bpf_trace_run, COUNT_ARGS(args))(prog, CAST_TO_U64(args)); \ > + this_cpu_dec(__bpf_trace_tp_active_##call); \ > } This approach will hurt real use cases where multiple and different raw_tp progs run on the same cpu. Instead let's disallow attaching to trace_contention and potentially any other hook with similar recursion properties. Another option is to add a recursion check to trace_contention itself.