Re: Packet pointer invalidation and subprograms

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Fri, 6 Dec 2024 09:42:12 -0800

On Fri, Dec 6, 2024 at 8:20 AM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Fri, Dec 6, 2024 at 8:13 AM Andrii Nakryiko
> <andrii.nakryiko@xxxxxxxxx> wrote:
> >
> > On Fri, Dec 6, 2024 at 8:08 AM Alexei Starovoitov
> > <alexei.starovoitov@xxxxxxxxx> wrote:
> > >
> > > On Thu, Dec 5, 2024 at 10:23 PM Andrii Nakryiko
> > > <andrii.nakryiko@xxxxxxxxx> wrote:
> > > >
> > > > On Thu, Dec 5, 2024 at 8:07 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
> > > > >
> > > > > On Thu, 2024-12-05 at 17:44 -0800, Alexei Starovoitov wrote:
> > > > > > On Thu, Dec 5, 2024 at 4:29 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
> > > > > > >
> > > > > > > so I went ahead and the fix does look simple:
> > > > > > > https://github.com/eddyz87/bpf/tree/skb-pull-data-global-func-bug
> > > > > >
> > > > > > Looks simple enough to me.
> > > > > > Ship it for bpf tree.
> > > > > > If we can come up with something better we can do it later in bpf-next.
> > > > > >
> > > > > > I very much prefer to avoid complexity as much as possible.
> > > > >
> > > > > Sent the patch-set for "simple".
> > > > > It is better then "dumb" by any metric anyways.
> > > > > Will try what Andrii suggests, as allowing calling global sub-programs
> > > > > from non-sleepable context sounds interesting.
> > > > >
> > > >
> > > > I haven't looked at your patches yet, but keep in mind another gotcha
> > > > with subprograms: they can be freplace'd by another BPF program
> > > > (clearly freplace programs were a successful reduction of
> > > > complexity... ;)
> > > >
> > > > What this means in practice is whatever deductions you get out of
> > > > analyzing any specific original subprogram might be violated by
> > > > freplace program if we don't enforce them during freplace attachment.
> > > >
> > > >
> > > > Anyways, I came here to say that I think I have a much simpler
> > > > solution that won't require big changes to the BPF verifier: tags. We
> > > > can shift the burden to the user having to declare the intent upfront
> > > > through subprog tags. And then, during verification of that global
> > > > subprog, the verifier can enforce that only explicitly declared side
> > > > effects can be enacted by the subprogram's code (taking into account
> > > > lazy dead code detection logic).
> > > >
> > > > We already take advantage of declarative tags for global subprog args
> > > > (__arg_trusted, etc), we can do the same for the function itself. We
> > > > can have __subprog_invalidates_all_pkt_pointers tag (and yes, I do
> > > > insist on this laconic name, of course), and during verification of
> > > > subprogram we just make sure that subprog was annotated as such, if
> > > > one of those fancy helpers is called directly in subprog itself or
> > > > transitively through any of *actually* called subprogs.
> > >
> > > tags for args was an aid to the verifier. Nothing is broken without them.
> > > Here it's about correctness.
> > > So we cannot use tags to solve this case.
> >
> > Hm.. Just like without an arg tag, verifier would conservatively
> > assume that `struct task_struct *task` global subprog argument is just
> > some opaque memory, not really a task, and would verify that argument
> > and code working with it as such. If a user did something that
> > required extra task_struct semantics, then that would be a
> > verification error. Unless the user added __arg_trusted, of course.
> >
> > Same thing here. We *assume* that global subprog doesn't have this
> > packet pointers side effect. If later during verification it turns out
> > it does have this effect -- this is an error and subprog gets
> > rejected. Unless the user provided
> > __subprog_invalidates_all_pkt_pointers, of course. Same thing.
>
> So depending on the order of walking the progs, compiler layout,
> static vs global the extra tag is either mandatory or not.

How so? If the verifier can *reach* one of those special helpers
during verification, then this tag would be required *for global
subprog*.

Or, *importantly*, if user anticipates that "freplace-ment" BPF
program for such subprog might need to invalidate packet pointers, but
the default subprog implementation doesn't actually call any of those
special helpers, user can just explicitly say that "yes, this subprog
should be treated as such that invalidates pkt pointers". With your
approach there is no way to even express this, unless you hack default
subprog implementation to intentionally have reachable
pkt-invalidating helper, but not really call it at runtime.

Think about some more advanced XDP chainer approach, where replacement
slots would need this pkt invalidation capabilities (but default
subprogs just are no-ops).

> That is horrible UX. I really don't like moving the burden to the user
> when the verifier can see it all.

I disagree, but it doesn't matter. It's being clear and explicit about
functionality that global subprog (verified in isolation) needs right
now or might need later (e.g., due to freplace-ment)

> arg_ctx is different. The verifier just doesn't have the knowledge.

No, it's not. It's conceptually absolutely the same. Verifier can
derive that global subprog arg has to be a trusted pointer. It's just
that with pkt invalidation it's trivial enough to detect (crudely and
eagerly, but still) in check_cfg(), while for trusted pointers you
can't take this shortcut.

I don't like the check_cfg()-based approach, it's hacky and subpar
workaround, and goes against all the lazy verification philosophy we
pursued with BPF CO-RE. I'm happy to discuss this offline if that will
be easier to get through all the nuance, but if you guys insist, so be
it.