On Fri, Dec 06, 2024 at 10:21:18AM -0800, Andrii Nakryiko wrote:
> On Fri, Dec 6, 2024 at 9:09 AM Jiri Olsa <olsajiri@xxxxxxxxx> wrote:
> >
> > On Wed, Oct 23, 2024 at 09:01:02AM -0700, Andrii Nakryiko wrote:
> > > On Wed, Oct 23, 2024 at 3:01 AM Jiri Olsa <jolsa@xxxxxxxxxx> wrote:
> > > >
> > > > Peter reported that perf_event_detach_bpf_prog might skip to release
> > > > the bpf program for -ENOENT error from bpf_prog_array_copy.
> > > >
> > > > This can't happen because bpf program is stored in perf event and is
> > > > detached and released only when perf event is freed.
> > > >
> > > > Let's make it obvious and add WARN_ON_ONCE on the -ENOENT check and
> > > > make sure the bpf program is released in any case.
> > > >
> > > > Cc: Sean Young <sean@xxxxxxxx>
> > > > Fixes: 170a7e3ea070 ("bpf: bpf_prog_array_copy() should return -ENOENT if exclude_prog not found")
> > > > Closes: https://lore.kernel.org/lkml/20241022111638.GC16066@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
> > > > Reported-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > > > Signed-off-by: Jiri Olsa <jolsa@xxxxxxxxxx>
> > > > ---
> > > >  kernel/trace/bpf_trace.c | 5 +++--
> > > >  1 file changed, 3 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > > > index 95b6b3b16bac..2c064ba7b0bd 100644
> > > > --- a/kernel/trace/bpf_trace.c
> > > > +++ b/kernel/trace/bpf_trace.c
> > > > @@ -2216,8 +2216,8 @@ void perf_event_detach_bpf_prog(struct perf_event *event)
> > > >
> > > >  	old_array = bpf_event_rcu_dereference(event->tp_event->prog_array);
> > > >  	ret = bpf_prog_array_copy(old_array, event->prog, NULL, 0, &new_array);
> > > > -	if (ret == -ENOENT)
> > > > -		goto unlock;
> > > > +	if (WARN_ON_ONCE(ret == -ENOENT))
> > > > +		goto put;
> > > >  	if (ret < 0) {
> > > >  		bpf_prog_array_delete_safe(old_array, event->prog);
> > >
> > > seeing
> > >
> > >   if (ret < 0)
> > >       bpf_prog_array_delete_safe(old_array, event->prog);
> > >
> > > I think neither ret == -ENOENT nor WARN_ON_ONCE is necessary, tbh. So
> > > now I feel like just dropping WARN_ON_ONCE() is better.
> >
> > hi,
> > there's syzbot report [1] where we could end up with following
> >
> >   - create perf event and set bpf program to it
> >   - clone process -> create inherited event
> >   - exit -> release both events
> >   - first perf_event_detach_bpf_prog call will release tp_event->prog_array
> >     and second perf_event_detach_bpf_prog will crash because
> >     tp_event->prog_array is NULL
> >
> > we can fix that quickly with change below, I guess we could add refcount
> > to bpf_prog_array_item and allow one of the parent/inherited events to
> > work while the other is gone.. but that might be too much, will check
> >
> > jirka
> >
> >
> > [1] https://lore.kernel.org/bpf/Z1MR6dCIKajNS6nU@krava/T/#m91dbf0688221ec7a7fc95e896a7ef9ff93b0b8ad
> > ---
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index fe57dfbf2a86..d4b45543ebc2 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -2251,6 +2251,8 @@ void perf_event_detach_bpf_prog(struct perf_event *event)
> >  		goto unlock;
> >
> >  	old_array = bpf_event_rcu_dereference(event->tp_event->prog_array);
> > +	if (!old_array)
> > +		goto put;
>
> How does this inherited event stuff work? You can have two separate
> events sharing the same prog_array? What if we attach different
> programs to each of those events, will both of them be called for
> either of two events? That sounds broken, if that's true.

so perf event with attr.inherit=1 attached on task will get inherited
by child process..
the new child event shares the parent's bpf program and tp_event (hence
prog_array) which is global for tracepoint AFAICS

when child process exits the inherited event is destroyed and it removes
related tp_event->prog_array, so the parent event won't trigger ever again,
the test below shows that

  test_tp_attach:FAIL:executed unexpected executed: actual 1 != expected 2

I'm not sure this is a problem in practice, because nobody complained about
that ;-)

libbpf does not set attr.inherit=1 and creates system wide perf event,
so no problem there

jirka


---
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 66173ddb5a2d..2e96241b5030 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -12430,8 +12430,9 @@ static int perf_event_open_tracepoint(const char *tp_category,
 	attr.type = PERF_TYPE_TRACEPOINT;
 	attr.size = attr_sz;
 	attr.config = tp_id;
+	attr.inherit = 1;
 
-	pfd = syscall(__NR_perf_event_open, &attr, -1 /* pid */, 0 /* cpu */,
+	pfd = syscall(__NR_perf_event_open, &attr, 0 /* pid */, 0 /* cpu */,
 		      -1 /* group_fd */, PERF_FLAG_FD_CLOEXEC);
 	if (pfd < 0) {
 		err = -errno;
diff --git a/tools/testing/selftests/bpf/prog_tests/tp_attach.c b/tools/testing/selftests/bpf/prog_tests/tp_attach.c
new file mode 100644
index 000000000000..01bbf1d1ab52
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/tp_attach.c
@@ -0,0 +1,35 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <test_progs.h>
+#include "tp_attach.skel.h"
+
+void test_tp_attach(void)
+{
+	struct tp_attach *skel;
+	int pid;
+
+	skel = tp_attach__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "tp_attach__open_and_load"))
+		return;
+
+	skel->bss->pid = getpid();
+
+	if (!ASSERT_OK(tp_attach__attach(skel), "tp_attach__attach"))
+		goto out;
+
+	getpid();
+
+	pid = fork();
+	if (!ASSERT_GE(pid, 0, "fork"))
+		goto out;
+	if (pid == 0)
+		_exit(0);
+	waitpid(pid, NULL, 0);
+
+	getpid();
+
+	ASSERT_EQ(skel->bss->executed, 2, "executed");
+
+out:
+	tp_attach__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/tp_attach.c b/tools/testing/selftests/bpf/progs/tp_attach.c
new file mode 100644
index 000000000000..d9450d2eac17
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/tp_attach.c
@@ -0,0 +1,17 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <vmlinux.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+int pid;
+int executed;
+
+SEC("tp/syscalls/sys_enter_getpid")
+int test(void *ctx)
+{
+	if (pid == (bpf_get_current_pid_tgid() >> 32))
+		executed++;
+	return 0;
+}
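
For illustration only, the sequence discussed in this thread (a parent and
an inherited event sharing one tp_event->prog_array, the first detach
emptying it, the second one finding it NULL) can be modeled in user space.
This is a toy sketch, not kernel code: every toy_* name is hypothetical, the
array holds a single program, and the per-event reference on the program is
an assumption made for the model.

```c
/* Toy user-space model of the shared prog_array; NOT kernel code. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_prog {
	int refcnt;			/* assumed: one reference per event */
};

struct toy_tp_event {
	struct toy_prog **prog_array;	/* NULL once the program is removed */
};

struct toy_event {
	struct toy_tp_event *tp_event;	/* shared by parent and inherited event */
	struct toy_prog *prog;		/* shared as well */
};

/* Attach prog to the tracepoint; a single slot is enough for the model. */
static void toy_attach(struct toy_event *ev, struct toy_tp_event *tp,
		       struct toy_prog *prog)
{
	tp->prog_array = calloc(2, sizeof(*tp->prog_array));
	assert(tp->prog_array);
	tp->prog_array[0] = prog;
	ev->tp_event = tp;
	ev->prog = prog;
	prog->refcnt++;
}

/* Inherit: the child reuses tp_event and prog, the array is not extended. */
static void toy_inherit(struct toy_event *child, const struct toy_event *parent)
{
	child->tp_event = parent->tp_event;
	child->prog = parent->prog;
	child->prog->refcnt++;
}

/* Returns 1 if this call tore the array down, 0 if it was already gone. */
static int toy_detach(struct toy_event *ev)
{
	int removed = 0;

	if (!ev->prog)
		return 0;

	/* Plays the role of the "if (!old_array) goto put;" guard. */
	if (ev->tp_event->prog_array) {
		free(ev->tp_event->prog_array);
		ev->tp_event->prog_array = NULL;
		removed = 1;
	}

	/* "put": the program reference is dropped in every case. */
	ev->prog->refcnt--;
	ev->prog = NULL;
	return removed;
}
```

In this model the first toy_detach() call (from either event) empties the
shared array, which is why the surviving event stops firing, and the second
call only drops its reference instead of touching a NULL array.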