Re: [PATCH v3 bpf-next 7/9] libbpf: implement __arg_ctx fallback logic

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Thu, 4 Jan 2024 12:58:23 -0800

On Thu, Jan 4, 2024 at 10:52 AM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Thu, Jan 4, 2024 at 10:37 AM Andrii Nakryiko
> <andrii.nakryiko@xxxxxxxxx> wrote:
> >
> > On Wed, Jan 3, 2024 at 9:39 PM Alexei Starovoitov
> > <alexei.starovoitov@xxxxxxxxx> wrote:
> > >
> > > On Wed, Jan 3, 2024 at 5:39 PM Andrii Nakryiko <andrii@xxxxxxxxxx> wrote:
> > > >
> > > > This limitation was the reason to add btf_decl_tag("arg:ctx"), making
> > > > the actual argument type not important, so that user can just define
> > > > "generic" signature:
> > > >
> > > >   __noinline int global_subprog(void *ctx __arg_ctx) { ... }
> > >
> > > I still think that this __arg_ctx only makes sense with 'void *'.
> > > Blind rewrite of ctx is a foot gun.
> > >
> > > I've tried the following:
> > >
> > > diff --git a/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c
> > > b/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c
> > > index 9a06e5eb1fbe..0e5f5205d4a8 100644
> > > --- a/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c
> > > +++ b/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c
> > > @@ -106,9 +106,9 @@ int perf_event_ctx(void *ctx)
> > >  /* this global subprog can be now called from many types of entry progs, each
> > >   * with different context type
> > >   */
> > > -__weak int subprog_ctx_tag(void *ctx __arg_ctx)
> > > +__weak int subprog_ctx_tag(long ctx __arg_ctx)
> > >  {
> > > -       return bpf_get_stack(ctx, stack, sizeof(stack), 0);
> > > +       return bpf_get_stack((void *)ctx, stack, sizeof(stack), 0);
> > >  }
> > >
> > >  struct my_struct { int x; };
> > > @@ -131,7 +131,7 @@ int arg_tag_ctx_raw_tp(void *ctx)
> > >  {
> > >         struct my_struct x = { .x = 123 };
> > >
> > > -       return subprog_ctx_tag(ctx) + subprog_multi_ctx_tags(ctx, &x, ctx);
> > > +       return subprog_ctx_tag((long)ctx) +
> > > subprog_multi_ctx_tags(ctx, &x, ctx);
> > >  }
> > >
> > > and it "works".
> >
> > Yeah, but you had to actively force casting everywhere *and* you still
> > had to consciously add __arg_ctx, right? If a user wants to subvert
> > the type system, they will do it. It's C, after all. But if they just
> > accidentally use sk_buff ctx and call it from the XDP program with
> > xdp_buff/xdp_md, the compiler will call out type mismatch.
>
> I could have used long everywhere and avoided casts.
>

My point was that it's hard to accidentally forget to "generalize"
type if you were supporting sk_buff, and suddenly started calling it
with xdp_md.

>From my POV, if I'm a user, and I declare an argument as long and
annotate it as __arg_ctx, then I know what I'm doing and I'd hate for
some smart-ass library to double-guess me dictating what exact
incantation I should specify to make it happy.

If I'm clueless and just randomly sprinkling __arg_ctx, then I have
bigger problems than type mismatch.

> > >
> > > Both kernel and libbpf should really limit it to 'void *'.
> > >
> > > In the other email I suggested to allow types that match expected
> > > based on prog type, but even that is probably a danger zone as well.
> > > The correct type would already be detected by the verifier,
> > > so extra __arg_ctx is pointless.
> > > It makes sense only for such polymorphic functions and those
> > > better use 'void *' and don't dereference it.
> > >
> > > I think this can be a follow up.
> >
> > Not really just polymorphic functions. Think about subprog
> > specifically for the fentry program, as one example. You *need*
> > __arg_ctx just to make context passing work, but you also want
> > non-`void *` type to access arguments.
> >
> > int subprog(u64 *args __arg_ctx) { ... }
> >
> > SEC("fentry")
> > int BPF_PROG(main_prog, ...)
> > {
> >     return subprog(ctx);
> > }
> >
> > Similarly, tracepoint programs, you'd have:
> >
> > int subprog(struct syscall_trace_enter* ctx __arg_ctx) { ... }
> >
> > SEC("tracepoint/syscalls/sys_enter_kill")
> > int main_prog(struct syscall_trace_enter* ctx)
> > {
> >     return subprog(ctx);
> > }
> >
> > So that's one group of cases.
>
> But the above two are not supported by libbpf
> since it doesn't handle "tracing" and "tracepoint" prog types
> in global_ctx_map.

Ok, so I'm confused now. I thought we were talking about both
kernel-side and libbpf-side extra checks.

Look, I don't want libbpf to be too smart and actually cause
unnecessary problems for users (pt_regs being one such case, see
below), and making users do work arounds just to satisfy libbpf. Like
passing `void * ctx __arg_ctx`, but then casting to `struct pt_regs`,
for example. (see below about pt_regs)

Sure, if someone has no clue what they are doing and specifies a
different type, I think it's acceptable for them to have that bug.
They will debug it, fix it, learn something, and won't do it again.
I'd rather assume users know what they are doing rather than
double-guess what they are doing.

If we are talking about libbpf-only changes just for those types that
libbpf is rewriting, fine (though I'm still not happy about struct
pt_regs case not working), we can add it. If Eduard concurs, I'll add
it, it's not hard. But as I said, I think libbpf would be doing
something that it's not supposed to do here (libbpf is just silently
adding an annotation, effectively, it's not changing how code is
generated or how verifier is interpreting types).

If we are talking about kernel-side extra checks, I propose we do that
on my next patch set adding PTR_TO_BTF_ID, but again, we need to keep
those non-polymorphic valid cases in mind (u64 *ctx for fentry,
tracepoint structs, etc) and not make them unnecessarily painful.

> I suspect the kernel sort-of supports above, but in a dangerous
> and broken way.
>
> My point is that users must not use __arg_ctx in these two cases.
> fentry (tracing prog type) wants 'void *' in the kernel to
> match to ctx.
> So the existing mechanism (prior to arg_ctx in the kernel)
> should already work.

Let's unpack. fentry doesn't "want" `void *`, it just doesn't support
passing context argument to global subprog. So you would have to
specify __arg_ctx, and that will only work on recent enough kernels.

At that point, all of `long ctx __arg_ctx`, `void *ctx __arg_ctx` and
`u64 *ctx __arg_ctx` will work. Yes, `long ctx` out of those 3 are
weird, but verifier will treat it as PTR_TO_CTX regardless of specific
type correctly.

More importantly, I'm saying that both `void *ctx __arg_ctx` and `u64
*ctx __arg_ctx` should work for fentry, don't you agree?

>
> > Another special case are networking programs, where both "__sk_buff"
> > and "sk_buff" are allowed, same for "xdp_buff" and "xdp_md".
>
> what do you mean both?
> networking bpf prog must only use __sk_buff and that is one and
> only supported ctx.
> Using 'struct sk_buff *ctx __arg_ctx' will be a bad bug.
> Since offsets will be all wrong while ctx rewrite will apply garbage
> and will likely fail.

You are right about wrong offsets, but the kernel does allow it. See
[0]. I actually tried, and indeed, it allows sk_buff to denote
"context". Note that I had to comment out skb->len dereference
(otherwise verifier will correctly complain about wrong offset), but
it is recognized as PTR_TO_CTX and I could technically pass it to
another subprog or helpers/kfuncs (and that would work).

  [0] https://lore.kernel.org/all/20230301154953.641654-2-joannelkoong@xxxxxxxxx/

diff --git a/tools/testing/selftests/bpf/progs/test_global_func2.c
b/tools/testing/selftests/bpf/progs/test_global_func2.c
index 2beab9c3b68a..29d7f3e78f8e 100644
--- a/tools/testing/selftests/bpf/progs/test_global_func2.c
+++ b/tools/testing/selftests/bpf/progs/test_global_func2.c
@@ -1,16 +1,15 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /* Copyright (c) 2020 Facebook */
-#include <stddef.h>
-#include <linux/bpf.h>
+#include "vmlinux.h"
 #include <bpf/bpf_helpers.h>
 #include "bpf_misc.h"

 #define MAX_STACK (512 - 3 * 32)

 static __attribute__ ((noinline))
-int f0(int var, struct __sk_buff *skb)
+int f0(int var, struct sk_buff *skb)
 {
-       return skb->len;
+       return 0;
 }

 __attribute__ ((noinline))
@@ -20,7 +19,7 @@ int f1(struct __sk_buff *skb)

        __sink(buf[MAX_STACK - 1]);

-       return f0(0, skb) + skb->len;
+       return f0(0, (void*)skb) + skb->len;
 }

 int f3(int, struct __sk_buff *skb, int);
@@ -45,5 +44,5 @@ SEC("tc")
 __success
 int global_func2(struct __sk_buff *skb)
 {
-       return f0(1, skb) + f1(skb) + f2(2, skb) + f3(3, skb, 4);
+       return f0(1, (void *)skb) + f1(skb) + f2(2, skb) + f3(3, skb, 4);
 }


>
> > Also, kprobes are special, both "struct bpf_user_pt_regs_t" and
> > *typedef* "bpf_user_pt_regs_t" are supported. But in practice users
> > will often just use `struct pt_regs *ctx`, actually.
>
> Same thing. The global bpf prog has to use bpf_user_pt_regs_t
> to be properly recognized as ctx arg type.
> Nothing special. Using 'struct pt_regs * ctx __arg_ctx' and blind
> rewrite will cause similar hard to debug bugs when
> bpf_user_pt_regs_t doesn't match pt_regs that bpf prog sees
> at compile time.

So this is not the same thing as skbuff. If BPF program is meant for a
single architecture, like x86-64, it's completely valid (and that's
what people have been doing with static subprogs for ages now) to just
use `struct pt_regs`. They are the same thing on x86.

I'll say even more, with libbpf's PT_REGS_xxx() macros you don't even
need to know about pt_regs vs user_pt_regs difference, as macros
properly force-cast arguments, depending on architecture. So in your
BPF code you can just pass `struct pt_regs *` around just fine across
multiple architectures as long as you only use PT_REGS_xxx() macros
and then pass that context to helpers (to get stack trace,
bpf_perf_event_output, etc).

No one even knows about bpf_user_pt_regs_t, I had to dig it up from
kernel source code and let users know what exact type name to use for
global subprog.