On Fri, Dec 6, 2024 at 8:20 AM Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: > > On Fri, Dec 6, 2024 at 8:13 AM Andrii Nakryiko > <andrii.nakryiko@xxxxxxxxx> wrote: > > > > On Fri, Dec 6, 2024 at 8:08 AM Alexei Starovoitov > > <alexei.starovoitov@xxxxxxxxx> wrote: > > > > > > On Thu, Dec 5, 2024 at 10:23 PM Andrii Nakryiko > > > <andrii.nakryiko@xxxxxxxxx> wrote: > > > > > > > > On Thu, Dec 5, 2024 at 8:07 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote: > > > > > > > > > > On Thu, 2024-12-05 at 17:44 -0800, Alexei Starovoitov wrote: > > > > > > On Thu, Dec 5, 2024 at 4:29 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote: > > > > > > > > > > > > > > so I went ahead and the fix does look simple: > > > > > > > https://github.com/eddyz87/bpf/tree/skb-pull-data-global-func-bug > > > > > > > > > > > > Looks simple enough to me. > > > > > > Ship it for bpf tree. > > > > > > If we can come up with something better we can do it later in bpf-next. > > > > > > > > > > > > I very much prefer to avoid complexity as much as possible. > > > > > > > > > > Sent the patch-set for "simple". > > > > > It is better then "dumb" by any metric anyways. > > > > > Will try what Andrii suggests, as allowing calling global sub-programs > > > > > from non-sleepable context sounds interesting. > > > > > > > > > > > > > I haven't looked at your patches yet, but keep in mind another gotcha > > > > with subprograms: they can be freplace'd by another BPF program > > > > (clearly freplace programs were a successful reduction of > > > > complexity... ;) > > > > > > > > What this means in practice is whatever deductions you get out of > > > > analyzing any specific original subprogram might be violated by > > > > freplace program if we don't enforce them during freplace attachment. > > > > > > > > > > > > Anyways, I came here to say that I think I have a much simpler > > > > solution that won't require big changes to the BPF verifier: tags. We > > > > can shift the burden to the user having to declare the intent upfront > > > > through subprog tags. And then, during verification of that global > > > > subprog, the verifier can enforce that only explicitly declared side > > > > effects can be enacted by the subprogram's code (taking into account > > > > lazy dead code detection logic). > > > > > > > > We already take advantage of declarative tags for global subprog args > > > > (__arg_trusted, etc), we can do the same for the function itself. We > > > > can have __subprog_invalidates_all_pkt_pointers tag (and yes, I do > > > > insist on this laconic name, of course), and during verification of > > > > subprogram we just make sure that subprog was annotated as such, if > > > > one of those fancy helpers is called directly in subprog itself or > > > > transitively through any of *actually* called subprogs. > > > > > > tags for args was an aid to the verifier. Nothing is broken without them. > > > Here it's about correctness. > > > So we cannot use tags to solve this case. > > > > Hm.. Just like without an arg tag, verifier would conservatively > > assume that `struct task_struct *task` global subprog argument is just > > some opaque memory, not really a task, and would verify that argument > > and code working with it as such. If a user did something that > > required extra task_struct semantics, then that would be a > > verification error. Unless the user added __arg_trusted, of course. > > > > Same thing here. We *assume* that global subprog doesn't have this > > packet pointers side effect. If later during verification it turns out > > it does have this effect -- this is an error and subprog gets > > rejected. Unless the user provided > > __subprog_invalidates_all_pkt_pointers, of course. Same thing. > > So depending on the order of walking the progs, compiler layout, > static vs global the extra tag is either mandatory or not. How so? If the verifier can *reach* one of those special helpers during verification, then this tag would be required *for global subprog*. Or, *importantly*, if user anticipates that "freplace-ment" BPF program for such subprog might need to invalidate packet pointers, but the default subprog implementation doesn't actually call any of those special helpers, user can just explicitly say that "yes, this subprog should be treated as such that invalidates pkt pointers". With your approach there is no way to even express this, unless you hack default subprog implementation to intentionally have reachable pkt-invalidating helper, but not really call it at runtime. Think about some more advanced XDP chainer approach, where replacement slots would need this pkt invalidation capabilities (but default subprogs just are no-ops). > That is horrible UX. I really don't like moving the burden to the user > when the verifier can see it all. I disagree, but it doesn't matter. It's being clear and explicit about functionality that global subprog (verified in isolation) needs right now or might need later (e.g., due to freplace-ment) > arg_ctx is different. The verifier just doesn't have the knowledge. No, it's not. It's conceptually absolutely the same. Verifier can derive that global subprog arg has to be a trusted pointer. It's just that with pkt invalidation it's trivial enough to detect (crudely and eagerly, but still) in check_cfg(), while for trusted pointers you can't take this shortcut. I don't like the check_cfg()-based approach, it's hacky and subpar workaround, and goes against all the lazy verification philosophy we pursued with BPF CO-RE. I'm happy to discuss this offline if that will be easier to get through all the nuance, but if you guys insist, so be it.