On Thu, Jan 4, 2024 at 7:58 PM Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote: > > On Thu, Jan 4, 2024 at 5:34 PM Alexei Starovoitov > <alexei.starovoitov@xxxxxxxxx> wrote: > > > > On Thu, Jan 4, 2024 at 12:58 PM Andrii Nakryiko > > <andrii.nakryiko@xxxxxxxxx> wrote: > > > > > > > > > My point was that it's hard to accidentally forget to "generalize" > > > type if you were supporting sk_buff, and suddenly started calling it > > > with xdp_md. > > > > > > From my POV, if I'm a user, and I declare an argument as long and > > > annotate it as __arg_ctx, then I know what I'm doing and I'd hate for > > > some smart-ass library to double-guess me dictating what exact > > > incantation I should specify to make it happy. > > > > But that's exactly what's happening! > > The smart-ass libbpf ignores the type in 'struct sk_buff *skb __arg_ctx' > > and replaces it with whatever is appropriate for prog type. > > The only thing that libbpf does in this case is it honors __arg_ctx > and makes it work *exactly the same* as __arg_ctx natively works on > the newest kernel. Not more, not less. It doesn't change compilation > or verification rules. At all. Here in all previous emails I was talking about both kernel and libbpf. Both shouldn't be breaking C rules. Not singling out libbpf. > Validating f1() func#2... > 20: R1=ctx() R10=fp0 > ; int f1(struct sk_buff *skb) > > It's a context. Ohh. Looks like I screwed it up back then. /* only compare that prog's ctx type name is the same as * kernel expects. No need to compare field by field. * It's ok for bpf prog to do: * struct __sk_buff {}; * int socket_filter_bpf_prog(struct __sk_buff *skb) * { // no fields of skb are ever used } */ if (strcmp(ctx_tname, "__sk_buff") == 0 && strcmp(tname, "sk_buff") == 0) return ctx_type; See comment. The intent was to allow __sk_buff in prog to match with __sk_buff in the kernel. Brainfart. > Not really, see below. For a long time *we thought* that kernel > recognizes bpf_user_pt_regs_t, but in reality it wanted `struct > bpf_user_pt_regs_t` which doesn't even exist in kernel and has nothing > common with either `struct pt_regs` or `struct user_pt_regs`. I fixed > that and now the kernel recognizes *both* typedef and struct > bpf_user_pt_regs_t. And there is no point in using typedef, because > `struct bpf_user_pt_regs_t` is backwards compatible and that's what > users actually use in practice. Hmm. The test with __weak int kprobe_typedef_ctx_subprog(bpf_user_pt_regs_t *ctx) was added back in Feb 2023. So it was surely working for the last year. > > __PT_REGS_CAST is arch dependent and typeof makes it seen with > > correct btf_id and the kernel knows it's PTR_TO_CTX. > > TBH, I don't know what btf_id has to do with this, it looks either as > a distraction or subtle point you are making that I'm missing. > __PT_REGS_CAST() just does C language cast, there is no BTF or BTF ID > involved here, so what am I missing? That was your patch :) I'm just pointing out the neat trick with typeof to put the correct type in there, so it's later seen with proper btf_id and recognized as ctx. You added it a year ago. > > Why not? This is what I don't get. Here's a real piece of code to > demonstrate what users do in practice: > > struct bpf_user_pt_regs_t {} > > __hidden int handle_event_user_pt_regs(struct bpf_user_pt_regs_t* ctx) { > if (pyperf_prog_cfg.sample_interval > 0) { > if (__sync_fetch_and_add(&total_events_count, 1) % > pyperf_prog_cfg.sample_interval) { > return 0; > } > } > > return handle_event_helper((struct pt_regs*)ctx, NULL); > } I think you're talking about kernel prior to that commit a year ago that made it possible to drop 'struct'. > I'm saying that I explicitly do want to be able to declare (in general):> > int handle_event_user(struct pt_regs *ctx __arg_ctx) { ...} > > And this would work both on old and new kernels, with and without > native __arg_ctx support. And it will be very close to static subprogs > in the existing code base. > > Why do you want to disallow this artificially? Not artificially, but only when pt_regs in bpf prog doesn't match what kernel is passing. I think allowing only: handle_event_user(void *ctx __arg_ctx) and prog will cast it to pt_regs immediately is less surprising and proper C code, but handle_event_user(struct pt_regs *ctx __arg_ctx) is also ok when pt_regs is indeed what is being passed. Which will be the case for x86. And will be fine on arm64 too, because arch/arm64/include/asm/ptrace.h struct pt_regs { union { struct user_pt_regs user_regs; but if arm64 ever changes that layout we should start failing to load. > All the above is already checked and enforced by the compiler. Libbpf > doesn't subvert it in any way. All that libbpf is doing is saying "ah, > user, you want this argument to be treated as PTR_TO_CTX, right? Too > bad host kernel is a bit too old to understand __arg_ctx natively, but > worry you not, I'll just quickly fix up BTF information that *only > kernel* uses *only to check type name* (nothing else!), and it will > look like kernel actually understood __arg_ctx, that's all, happy > BPF'ing!". and this way libbpf _may_ introduce a hard to debug bug. The same mistake the new kernel _may_ do with __arg_ctx with old libbpf. Both will do a hidden typecast when the bpf prog is potentially written with different type. foo(struct pt_regs *ctx __arg_ctx) Quick git grep shows that it will probably work on all archs except 'arc' where arch/arc/include/uapi/asm/bpf_perf_event.h:typedef struct user_regs_struct bpf_user_pt_regs_t; and struct pt_regs seems to have a different layout than user_regs_struct. But when kernel allows sk_buff to be passed into foo(struct pt_regs *ctx __arg_ctx) is just broken.