Re: [PATCH bpf-next 5/6] bpf: add bpf_test_tp() kfunc triggering tp and allowing error injection

On Mon, Mar 25, 2024 at 3:19 PM Andrii Nakryiko
<andrii.nakryiko@xxxxxxxxx> wrote:
>
> On Mon, Mar 25, 2024 at 10:37 AM Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > On Thu, Mar 21, 2024 at 5:01 PM Andrii Nakryiko <andrii@xxxxxxxxxx> wrote:
> > >
> > > Add a simple bpf_test_tp() kfunc, available to all program types. It
> > > is useful for various testing and benchmarking scenarios, as it allows
> > > triggering most tracing BPF program types from the BPF side.
> > >
> > > Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> > > ---
> > >  kernel/bpf/bpf_test.h | 34 ++++++++++++++++++++++++++++++++++
> > >  kernel/bpf/helpers.c  | 13 +++++++++++++
> > >  2 files changed, 47 insertions(+)
> > >  create mode 100644 kernel/bpf/bpf_test.h
> > >
> > > diff --git a/kernel/bpf/bpf_test.h b/kernel/bpf/bpf_test.h
> > > new file mode 100644
> > > index 000000000000..c779927d3574
> > > --- /dev/null
> > > +++ b/kernel/bpf/bpf_test.h
> > > @@ -0,0 +1,34 @@
> > > +/* SPDX-License-Identifier: GPL-2.0 */
> > > +#undef TRACE_SYSTEM
> > > +#define TRACE_SYSTEM bpf_test
> > > +
> > > +#if !defined(_TRACE_BPF_TEST_H) || defined(TRACE_HEADER_MULTI_READ)
> > > +
> > > +#define _TRACE_BPF_TEST_H
> > > +
> > > +#include <linux/tracepoint.h>
> > > +
> > > +TRACE_EVENT(bpf_test,
> > > +
> > > +       TP_PROTO(int nonce),
> > > +
> > > +       TP_ARGS(nonce),
> > > +
> > > +       TP_STRUCT__entry(
> > > +               __field(int, nonce)
> > > +       ),
> > > +
> > > +       TP_fast_assign(
> > > +               __entry->nonce = nonce;
> > > +       ),
> > > +
> > > +       TP_printk("nonce %d", __entry->nonce)
> > > +);
> > > +
> > > +#endif /* _TRACE_BPF_TEST_H */
> > > +
> > > +#undef TRACE_INCLUDE_PATH
> > > +#define TRACE_INCLUDE_PATH .
> > > +#define TRACE_INCLUDE_FILE bpf_test
> > > +
> > > +#include <trace/define_trace.h>
> > > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > > index 9234174ccb21..67bf9659c447 100644
> > > --- a/kernel/bpf/helpers.c
> > > +++ b/kernel/bpf/helpers.c
> > > @@ -26,6 +26,10 @@
> > >
> > >  #include "../../lib/kstrtox.h"
> > >
> > > +#define CREATE_TRACE_POINTS
> > > +#include "bpf_test.h"
> > > +
> > > +
> > >  /* If kernel subsystem is allowing eBPF programs to call this function,
> > >   * inside its own verifier_ops->get_func_proto() callback it should return
> > >   * bpf_map_lookup_elem_proto, so that verifier can properly check the arguments
> > > @@ -2549,6 +2553,14 @@ __bpf_kfunc void bpf_throw(u64 cookie)
> > >         WARN(1, "A call to BPF exception callback should never return\n");
> > >  }
> > >
> > > +__bpf_kfunc int bpf_test_tp(int nonce)
> > > +{
> > > +       trace_bpf_test(nonce);
> > > +
> > > +       return nonce;
> > > +}
> > > +ALLOW_ERROR_INJECTION(bpf_test_tp, ERRNO);
> >
> > There is a better way with register_btf_fmodret_id_set().
> > Error injection can be disabled in config.
>
> ok, didn't know about it
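
The way I read it, the kfunc just needs to be added to the id set that
net/bpf/test_run.c already registers for fmod_ret, so fmod_ret attachment
keeps working even when error injection is disabled in the config. Roughly
the pattern below; the id_set wrapper and init hook names are from memory,
so treat them as approximate:

/* in net/bpf/test_run.c, next to the existing test kfuncs */
BTF_KFUNCS_START(bpf_test_modify_return_ids)
BTF_ID_FLAGS(func, bpf_modify_return_test)
BTF_ID_FLAGS(func, bpf_modify_return_test2)
BTF_ID_FLAGS(func, bpf_modify_return_test_tp)	/* the new kfunc */
BTF_ID_FLAGS(func, bpf_fentry_test1, KF_SLEEPABLE)
BTF_KFUNCS_END(bpf_test_modify_return_ids)

static const struct btf_kfunc_id_set bpf_test_modify_return_set = {
	.owner = THIS_MODULE,
	.set   = &bpf_test_modify_return_ids,
};

/* test_run.c already does this registration at init time, so adding the
 * new kfunc to the id set above should be the only change needed; it makes
 * the func attachable via fmod_ret through BTF, independent of
 * CONFIG_FUNCTION_ERROR_INJECTION.
 */
static int __init bpf_test_kfunc_init(void)
{
	return register_btf_fmodret_id_set(&bpf_test_modify_return_set);
}
late_initcall(bpf_test_kfunc_init);
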
>
> >
> > Also, we already have two such test funcs:
> > bpf_modify_return_test*
> > and they can be exercised by bpf_prog_test_run_tracing.
> > Let's make it recognize the 'repeat' argument and call
> > bpf_modify_return_test or a 3rd such func multiple times?
>
> I'll add bpf_modify_return_test_tp() so as not to touch all the tests using
> bpf_modify_return_test() and bpf_modify_return_test2(), if that's ok.
> The existing ones expect some memory pointer and dereference it, etc., so
> it seems cleaner to have a dedicated tp-triggering one for this.
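
(Concretely, for comparison -- the existing signature vs the dedicated
tp-triggering one; the existing kfunc is quoted from memory, so approximate:)

__bpf_kfunc int bpf_modify_return_test(int a, int *b)
{
	*b += 1;
	return a + *b;
}

/* dedicated tp-triggering kfunc, no memory argument to set up */
__bpf_kfunc int bpf_modify_return_test_tp(int nonce)
{
	trace_bpf_trigger_tp(nonce);

	return nonce;
}
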
>
> > Exercising the test tracepoint can happen there as well.
> > Asking a bpf prog to call a kfunc to call a tracepoint
> > looks like an extra hop.
> > The existing test_run facility should be able to accommodate this.
>
> You mean if I add bpf_modify_return_test_tp() above, I should pass an
> argument of how many times that tracepoint should be triggered? Or you
> mean to use test_run's repeat argument to trigger "driver program" N
> times, and the driver program would just call
> bpf_modify_return_test_tp() once? If the latter, I'm not sure it's the
> same as calling the driver program once and doing a loop inside, as
> we'll end up measuring more of the driver program invocation overhead
> (N times vs 1 time right now, per N tp/fmod_ret calls).
>
> (But tbh, not having to use test_run's repeat functionality is a
> benefit, IMO: we can have more flexible counting/timing code and
> whatever else we might need. I'm not sure why using test_run's repeat
> is advantageous here.)
>
> Not sure what you are trying to optimize for here, please clarify.
>
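
To spell out the two options I see from the bench's userspace side (a rough
sketch, assuming the driver program's test_run supported a 'repeat' argument
as Alexei suggests adding; untested):

#include <bpf/bpf.h>

/* Option A: let test_run repeat the driver program N times; the driver
 * would call the kfunc once per invocation, so the test_run entry/exit
 * overhead is paid N times and shows up in the measurement.
 */
static int run_via_repeat(int driver_prog_fd, int n)
{
	LIBBPF_OPTS(bpf_test_run_opts, opts, .repeat = n);

	return bpf_prog_test_run_opts(driver_prog_fd, &opts);
}

/* Option B (what trigger_bench does): each test_run invocation of the
 * driver does batch_iters kfunc calls in a BPF-side loop, so the test_run
 * overhead is amortized over the whole batch.
 */
static int run_via_inner_loop(int driver_prog_fd)
{
	LIBBPF_OPTS(bpf_test_run_opts, opts); /* no repeat; the loop is in BPF */

	return bpf_prog_test_run_opts(driver_prog_fd, &opts);
}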

So I currently have these changes. I moved the tracepoint into
bpf_test_run.h (didn't know we had that; this should eliminate the issue
that Jiri saw as well). I moved the kfunc into net/bpf/test_run.c and
renamed it to bpf_modify_return_test_tp(), but I kept it referenced in
"common kfuncs", so that raw_tp and others can call it (it doesn't seem
like I need an extern declaration for it).

I just didn't touch the test_run functionality. All this works in the
bench tool, which is nice.

diff --git a/include/trace/events/bpf_test_run.h b/include/trace/events/bpf_test_run.h
index 265447e3f71a..0c924d39b7cb 100644
--- a/include/trace/events/bpf_test_run.h
+++ b/include/trace/events/bpf_test_run.h
@@ -7,6 +7,23 @@

 #include <linux/tracepoint.h>

+TRACE_EVENT(bpf_trigger_tp,
+
+       TP_PROTO(int nonce),
+
+       TP_ARGS(nonce),
+
+       TP_STRUCT__entry(
+               __field(int, nonce)
+       ),
+
+       TP_fast_assign(
+               __entry->nonce = nonce;
+       ),
+
+       TP_printk("nonce %d", __entry->nonce)
+);
+
 DECLARE_EVENT_CLASS(bpf_test_finish,

        TP_PROTO(int *err),
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 67bf9659c447..a85dc684450a 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2553,14 +2553,6 @@ __bpf_kfunc void bpf_throw(u64 cookie)
        WARN(1, "A call to BPF exception callback should never return\n");
 }

-__bpf_kfunc int bpf_test_tp(int nonce)
-{
-       trace_bpf_test(nonce);
-
-       return nonce;
-}
-ALLOW_ERROR_INJECTION(bpf_test_tp, ERRNO);
-
 __bpf_kfunc_end_defs();

 BTF_KFUNCS_START(generic_btf_ids)
@@ -2637,7 +2629,7 @@ BTF_ID_FLAGS(func, bpf_dynptr_is_null)
 BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly)
 BTF_ID_FLAGS(func, bpf_dynptr_size)
 BTF_ID_FLAGS(func, bpf_dynptr_clone)
-BTF_ID_FLAGS(func, bpf_test_tp)
+BTF_ID_FLAGS(func, bpf_modify_return_test_tp)
 BTF_KFUNCS_END(common_btf_ids)

 static const struct btf_kfunc_id_set common_kfunc_set = {
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 61efeadaff8d..f6aad4ed2ab2 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -575,6 +575,13 @@ __bpf_kfunc int bpf_modify_return_test2(int a, int *b, short c, int d,
        return a + *b + c + d + (long)e + f + g;
 }

+__bpf_kfunc int bpf_modify_return_test_tp(int nonce)
+{
+       trace_bpf_trigger_tp(nonce);
+
+       return nonce;
+}
+
 int noinline bpf_fentry_shadow_test(int a)
 {
        return a + 1;
@@ -622,6 +629,7 @@ __bpf_kfunc_end_defs();
 BTF_KFUNCS_START(bpf_test_modify_return_ids)
 BTF_ID_FLAGS(func, bpf_modify_return_test)
 BTF_ID_FLAGS(func, bpf_modify_return_test2)
+BTF_ID_FLAGS(func, bpf_modify_return_test_tp)
 BTF_ID_FLAGS(func, bpf_fentry_test1, KF_SLEEPABLE)
 BTF_KFUNCS_END(bpf_test_modify_return_ids)

diff --git a/tools/testing/selftests/bpf/progs/trigger_bench.c b/tools/testing/selftests/bpf/progs/trigger_bench.c
index 914ed625441a..2619ed193c65 100644
--- a/tools/testing/selftests/bpf/progs/trigger_bench.c
+++ b/tools/testing/selftests/bpf/progs/trigger_bench.c
@@ -56,7 +56,7 @@ int trigger_driver(void *ctx)
        return 0;
 }

-extern int bpf_test_tp(int nonce) __ksym __weak;
+extern int bpf_modify_return_test_tp(int nonce) __ksym __weak;

 SEC("?raw_tp")
 int trigger_driver_kfunc(void *ctx)
@@ -64,7 +64,7 @@ int trigger_driver_kfunc(void *ctx)
        int i;

        for (i = 0; i < batch_iters; i++)
-               (void)bpf_test_tp(0); /* attach point for benchmarking */
+               (void)bpf_modify_return_test_tp(0); /* attach point for benchmarking */

        return 0;
 }
@@ -111,21 +111,21 @@ int bench_trigger_fexit(void *ctx)
        return 0;
 }

-SEC("?fmod_ret/bpf_test_tp")
+SEC("?fmod_ret/bpf_modify_return_test_tp")
 int bench_trigger_fmodret(void *ctx)
 {
        inc_counter();
        return -22;
 }

-SEC("?tp/bpf_test/bpf_test")
+SEC("?tp/bpf_test_run/bpf_trigger_tp")
 int bench_trigger_tp(void *ctx)
 {
        inc_counter();
        return 0;
 }

-SEC("?raw_tp/bpf_test")
+SEC("?raw_tp/bpf_trigger_tp")
 int bench_trigger_rawtp(void *ctx)
 {
        inc_counter();

> >
> > I still don't get how modifying the kernel is better than
> > adding all that to a kernel module.
> > When you want to run the bench tool on a different server
> > you either need to build/copy bpf_testmod or
> > build/copy the whole kernel.
> > But fine.
>
> I'm optimizing for having a single pre-built bench binary that one can
> scp to multiple different (potentially custom-built) kernels and run
> benchmarks. Of course, on old kernels that don't yet have this new
> bpf_modify_return_test_tp() kfunc, some of the benchmarks won't work
> (raw_tp/fmod_ret/tp only), but that's fine; as newer kernels get
> deployed, we'll eventually get them even in production. The kprobe/fentry
> benchmarks will work as is (which is why I'm not switching them to the
> new kfunc, so we can run them on a wider variety of kernels).
>
> > All the test things in net/bpf/test_run.c will stay.




