Re: Packet pointer invalidation and subprograms

Eduard Zingerman <eddyz87@xxxxxxxxx> · Thu, 05 Dec 2024 16:29:08 -0800

On Thu, 2024-12-05 at 16:03 -0800, Andrii Nakryiko wrote:

[...]

> > There are several ways to fix this:
> > - The "dumb" way:
> >   - forbid calling helpers that bpf_helper_changes_pkt_data()
> >     from global functions.
> > - The "simple" way:
> >   - at some early stage:
> >     - scan all global functions, to see if there are any calls to
> >       helpers that bpf_helper_changes_pkt_data(). If there are,
> >       remember this as an "effect" of the function;
> >     - build a call-graph of global functions and propagate computed
> >       effects over this call-graph (if A calls B and B does
> >       clear_all_pkt_pointers(), then A also does it).
> >   - during main verification phase, if a call to a global function is
> >     verified, check it's effects and update state accordingly
> >     (e.g. call clear_all_pkt_pointers()).
> > - The "correct" way:
> >   - build a call-graph of global functions;
> >   - verify these functions in a post-order;
> >   - while verifying, collect "effects" information
> >     (so far, the single effect is whether or not
> >      clear_all_pkt_pointers() had been ever called for the function);
> 
> So this is the only "side effect" we have right now? Are there any
> others we have already or can reasonably anticipate? I'm just trying
> to decide if we need to generalize this concept.

Don't have anything to add to Kumar's answer.

> >   - if a call to global function is verified, check it's effects and
> >     update state accordingly (e.g. call clear_all_pkt_pointers()).
> > 
> > "dumb" is probably a no-go as it is too restrictive.
> > The only advantage of "simple" over "correct" that I see is
> > that the logic for clear_all_pkt_pointers() remains confined
> > to check_helper_call() and is not duplicated in a separate pass.
> 
> "simple" doesn't take into account dead code elimination, undermining
> BPF CO-RE-based feature detection, so I think this is also a no-go

That's a bummer :)
I takled with Alexei yesterday and he preferred "simple",
so I went ahead and the fix does look simple:
https://github.com/eddyz87/bpf/tree/skb-pull-data-global-func-bug

> > > In theory, this also allows to compute more complex function effects
> > on the main verification pass.
> > 
> > I think "simple" is a way to go at the moment.
> 
> I think neither of the above are fully valid, tbh. "correct" will do
> eager subprog validation, even if due to dead code elimination that
> global function might not have been called.
> 
> From the outset, I think the "right" way to solve this would be to
> start verification from the main program. When we encounter global
> subprog verification, we pause verification for the main program,
> create a new isolated verifier state, proceed with global subprog
> verification, and so on until we check everything. So basically a
> stack of subprogs to validate.

This might be not that hard to do, actually.

If global function had not been verified yet, and a call to such
function is encountered:
- setup arguments as for global function verification;
- keep stack and BPF_EXIT processing as for non-global functions;

> This is PITA, of course, just for this (which is also the question
> about the generalization of the "side effects" concept). So I don't
> know, maybe for now the "dumb" way is the way?

Idk, "dumb" seems too restrictive.