Re: [PATCH v3 bpf-next 7/9] libbpf: implement __arg_ctx fallback logic

Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> · Thu, 4 Jan 2024 21:42:39 -0800

On Thu, Jan 4, 2024 at 7:58 PM Andrii Nakryiko
<andrii.nakryiko@xxxxxxxxx> wrote:
>
> On Thu, Jan 4, 2024 at 5:34 PM Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > On Thu, Jan 4, 2024 at 12:58 PM Andrii Nakryiko
> > <andrii.nakryiko@xxxxxxxxx> wrote:
> > >
> > >
> > > My point was that it's hard to accidentally forget to "generalize"
> > > type if you were supporting sk_buff, and suddenly started calling it
> > > with xdp_md.
> > >
> > > From my POV, if I'm a user, and I declare an argument as long and
> > > annotate it as __arg_ctx, then I know what I'm doing and I'd hate for
> > > some smart-ass library to double-guess me dictating what exact
> > > incantation I should specify to make it happy.
> >
> > But that's exactly what's happening!
> > The smart-ass libbpf ignores the type in 'struct sk_buff *skb __arg_ctx'
> > and replaces it with whatever is appropriate for prog type.
>
> The only thing that libbpf does in this case is it honors __arg_ctx
> and makes it work *exactly the same* as __arg_ctx natively works on
> the newest kernel. Not more, not less. It doesn't change compilation
> or verification rules. At all.

Here in all previous emails I was talking about both kernel and libbpf.
Both shouldn't be breaking C rules.
Not singling out libbpf.

> Validating f1() func#2...
> 20: R1=ctx() R10=fp0
> ; int f1(struct sk_buff *skb)
>
> It's a context.

Ohh. Looks like I screwed it up back then.
        /* only compare that prog's ctx type name is the same as
         * kernel expects. No need to compare field by field.
         * It's ok for bpf prog to do:
         * struct __sk_buff {};
         * int socket_filter_bpf_prog(struct __sk_buff *skb)
         * { // no fields of skb are ever used }
         */
        if (strcmp(ctx_tname, "__sk_buff") == 0 && strcmp(tname,
"sk_buff") == 0)
                return ctx_type;

See comment. The intent was to allow __sk_buff in prog to
match with __sk_buff in the kernel.
Brainfart.

> Not really, see below. For a long time *we thought* that kernel
> recognizes bpf_user_pt_regs_t, but in reality it wanted `struct
> bpf_user_pt_regs_t` which doesn't even exist in kernel and has nothing
> common with either `struct pt_regs` or `struct user_pt_regs`. I fixed
> that and now the kernel recognizes *both* typedef and struct
> bpf_user_pt_regs_t. And there is no point in using typedef, because
> `struct bpf_user_pt_regs_t` is backwards compatible and that's what
> users actually use in practice.

Hmm.
The test with
__weak int kprobe_typedef_ctx_subprog(bpf_user_pt_regs_t *ctx)

was added back in Feb 2023.
So it was surely working for the last year.

> > __PT_REGS_CAST is arch dependent and typeof makes it seen with
> > correct btf_id and the kernel knows it's PTR_TO_CTX.
>
> TBH, I don't know what btf_id has to do with this, it looks either as
> a distraction or subtle point you are making that I'm missing.
> __PT_REGS_CAST() just does C language cast, there is no BTF or BTF ID
> involved here, so what am I missing?

That was your patch :)
I'm just pointing out the neat trick with typeof to put
the correct type in there,
so it's later seen with proper btf_id and recognized as ctx.
You added it a year ago.

>
> Why not? This is what I don't get. Here's a real piece of code to
> demonstrate what users do in practice:
>
> struct bpf_user_pt_regs_t {}
>
> __hidden int handle_event_user_pt_regs(struct bpf_user_pt_regs_t* ctx) {
>   if (pyperf_prog_cfg.sample_interval > 0) {
>     if (__sync_fetch_and_add(&total_events_count, 1) %
>         pyperf_prog_cfg.sample_interval) {
>       return 0;
>     }
>   }
>
>   return handle_event_helper((struct pt_regs*)ctx, NULL);
> }

I think you're talking about kernel prior to that commit a year ago
that made it possible to drop 'struct'.

> I'm saying that I explicitly do want to be able to declare (in general):>
> int handle_event_user(struct pt_regs *ctx __arg_ctx) { ...}
>
> And this would work both on old and new kernels, with and without
> native __arg_ctx support. And it will be very close to static subprogs
> in the existing code base.
>
> Why do you want to disallow this artificially?

Not artificially, but only when pt_regs in bpf prog doesn't match
what kernel is passing.
I think allowing only:
  handle_event_user(void *ctx __arg_ctx)
and prog will cast it to pt_regs immediately is less surprising
and proper C code,
but
  handle_event_user(struct pt_regs *ctx __arg_ctx)
is also ok when pt_regs is indeed what is being passed.
Which will be the case for x86.
And will be fine on arm64 too, because
arch/arm64/include/asm/ptrace.h
struct pt_regs {
        union {
                struct user_pt_regs user_regs;

but if arm64 ever changes that layout we should start failing to load.

> All the above is already checked and enforced by the compiler. Libbpf
> doesn't subvert it in any way. All that libbpf is doing is saying "ah,
> user, you want this argument to be treated as PTR_TO_CTX, right? Too
> bad host kernel is a bit too old to understand __arg_ctx natively, but
> worry you not, I'll just quickly fix up BTF information that *only
> kernel* uses *only to check type name* (nothing else!), and it will
> look like kernel actually understood __arg_ctx, that's all, happy
> BPF'ing!".

and this way libbpf _may_ introduce a hard to debug bug.
The same mistake the new kernel _may_ do with __arg_ctx with old libbpf.
Both will do a hidden typecast when the bpf prog is
potentially written with different type.

foo(struct pt_regs *ctx __arg_ctx)
Quick git grep shows that it will probably work on all archs
except 'arc' where
arch/arc/include/uapi/asm/bpf_perf_event.h:typedef struct
user_regs_struct bpf_user_pt_regs_t;
and struct pt_regs seems to have a different layout than user_regs_struct.

But when kernel allows sk_buff to be passed into
foo(struct pt_regs *ctx __arg_ctx)
is just broken.