On Sun, Sep 17, 2023 at 2:37 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
>
> On Fri, 2023-07-07 at 11:08 -0700, Andrii Nakryiko wrote:
> > On Fri, Jul 7, 2023 at 9:44 AM Alexei Starovoitov
> > <alexei.starovoitov@xxxxxxxxx> wrote:
> > >
> > > On Fri, Jul 7, 2023 at 7:04 AM Andrew Werner <awerner32@xxxxxxxxx> wrote:
> > > >
> > > > When it comes to fixing the problem, I don't quite know where to start.
> > > > Perhaps these iteration callbacks ought to be treated more like global
> > > > functions -- you can't always make assumptions about the state of the
> > > > data in the context pointer. Treating the context pointer as totally
> > > > opaque seems bad from a usability perspective. Maybe there's a way to
> > > > attempt to verify the function body by treating all or part of the
> > > > context as read-only, and then if that fails, go back and assume
> > > > nothing about that part of the context structure. What is the right
> > > > way to think about plugging this hole?
> > >
> > > 'treat as global' might be a way to fix it, but it will likely break
> > > some setups, since people pass pointers in a context and current
> > > global func verification doesn't support that.
> >
> > yeah, the intended use case is to return something from callbacks
> > through the context pointer. So it will definitely break real uses.
> >
> > > I think the simplest fix would be to disallow writes into any pointers
> > > within a ctx. Writes to scalars should still be allowed.
> >
> > It might be a partial mitigation, but even with SCALARs there could be
> > problems due to an assumed SCALAR range, which will be invalid if the
> > callback is never called or called many times.
> >
> > > Much more complex fix would be to verify callbacks as
> > > process_iter_next_call(). See the giant comment next to it.
> >
> > yep, that seems like the right way forward. We need to simulate 0, 1,
> > 2, ... executions of callbacks until we validate and exhaust all
> > possible context modification permutations, just like open-coded
> > iterators do it.
> >
> > > But since we need a fix for stable I'd try the simple approach first.
> > > Could you try to implement that?
>
> Hi All,
>
> This issue seems stalled, so I took a look over the weekend.
> I have a work-in-progress fix; please let me know if you don't agree
> with the direction I'm taking or if I'm stepping on anyone's toes.
>
> After some analysis I decided to go with Alexei's suggestion and
> implement something similar to iterators for a selected set of helper
> functions that accept "looping" callbacks, such as:
> - bpf_loop
> - bpf_for_each_map_elem
> - bpf_user_ringbuf_drain
> - bpf_timer_set_callback (?)

As Kumar mentioned, pretty much all helpers with callbacks are either
0-1 or 0-1-many cases, so all of them (except for the async callback
ones, which shouldn't be accepting parent stack pointers) should be
validated.

> The sketch of the fix looks as follows:
> - extend struct bpf_func_state with callback iterator state information:
>   - active/non-active flag

not sure why you need this flag

>   - depth

yep, this seems necessary

>   - r1-r5 state at the entry to callback;

not sure why you need this, this should be part of func_state already?
> - extend __check_func_call() to set up callback iterator state when
>   a call to a suitable helper function is processed;

this logic is "like iterator", but it's not exactly the same, so I
don't think we should reuse bpf_iter state (as you can see above with
the active/non-active flag, for example)

> - extend BPF_EXIT processing (prepare_func_exit()) to queue a new
>   callback visit upon return from an iterating callback
>   (similar to process_iter_next_call());

as mentioned above, we should simulate the "no callback called"
situation as well, don't forget about that

> - extend is_state_visited() to account for callback iterator hits in a
>   way similar to regular iterators;
> - extend backtrack_insn() to correctly react to jumps from callback
>   exit to callback entry.
>
> I have a patch (at the end of this email) that correctly recognizes
> the bpf_loop example in this thread as unsafe. However, this patch has
> a few deficiencies:
>
> - the verif_scale_strobemeta_bpf_loop selftest is not passing, because
>   the read_map_var function invoked from the callback continuously
>   increments the 'payload' pointer used in subsequent iterations.
>
>   In order to handle such code I need to add upper-bound tracking
>   for iteration depth (when such a bound can be deduced).

if the example is broken and really can get out of bounds, we should
fix the example to be provable within bounds no matter how many
iterations it was called with

> - loop detection is broken for a simple callback as below:
>
>   static int loop_callback_infinite(__u32 idx, __u64 *data)
>   {
>           for (;;)
>                   (*ctx)++;
>           return 0;
>   }
>
>   To handle such code I need to change is_state_visited() to do
>   callback iterator loop/hit detection on exit from the callback
>   (returns are not prune points at the moment); currently it is done
>   on entry.

I'm a bit confused. What's ctx in the above example? And why doesn't
loop detection catch the for(;;) loop right now?
> - the callback as below currently causes state explosion:
>
>   static int precise1_callback(__u32 idx, struct precise1_ctx *ctx)
>   {
>           if (idx == 0)
>                   ctx->a = 1;
>           if (idx == 1 && ctx->a == 1)
>                   ctx->b = 1;
>           return 0;
>   }

why state explosion? there should be a bunch of different branches
(idx being 0, 1, or something else, times ctx->a being 1 or not, times
ctx->b being 1 or not), but it should be a fixed number of states?
Do you know what causes the explosion?

> I'm not sure yet what to do about this, there are several possibilities:
> - tweak the order in which states are visited (need to think about it);
> - check states in bpf_verifier_env::head (not explored yet) for
>   equivalence and avoid enqueuing duplicate states.
>
> I'll proceed addressing the issues above on Monday.
>
> Thanks,
> Eduard
>
> ---
>
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index a3236651ec64..5589f55e42ba 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -70,6 +70,17 @@ enum bpf_iter_state {
>  	BPF_ITER_STATE_DRAINED,
>  };
>
> +struct bpf_iter {
> +	/* BTF container and BTF type ID describing
> +	 * struct bpf_iter_<type> of an iterator state
> +	 */
> +	struct btf *btf;
> +	u32 btf_id;
> +	/* packing following two fields to fit iter state into 16 bytes */
> +	enum bpf_iter_state state:2;
> +	int depth:30;
> +};
> +
>  struct bpf_reg_state {
>  	/* Ordering of fields matters.  See states_equal() */
>  	enum bpf_reg_type type;
> @@ -115,16 +126,7 @@ struct bpf_reg_state {
>  	} dynptr;
>
>  	/* For bpf_iter stack slots */
> -	struct {
> -		/* BTF container and BTF type ID describing
> -		 * struct bpf_iter_<type> of an iterator state
> -		 */
> -		struct btf *btf;
> -		u32 btf_id;
> -		/* packing following two fields to fit iter state into 16 bytes */
> -		enum bpf_iter_state state:2;
> -		int depth:30;
> -	} iter;
> +	struct bpf_iter iter;

Let's not do this. Conceptually the processes are similar, but bpf_iter
is one thing, and this callback validation is another thing. Let's not
conflate the two.

>
>  	/* Max size from any of the above. */
>  	struct {

[...]