On Tue, Sep 19, 2023 at 5:06 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
>
> On Tue, 2023-09-19 at 16:14 -0700, Andrii Nakryiko wrote:
> [...]
> > > Hi All,
> > >
> > > This issue seems stalled, so I took a look over the weekend.
> > > I have a work-in-progress fix, please let me know if you don't agree
> > > with the direction I'm taking or if I'm stepping on anyone's toes.
> > >
> > > After some analysis I decided to go with Alexei's suggestion and
> > > implement something similar to iterators for a selected set of helper
> > > functions that accept "looping" callbacks, such as:
> > > - bpf_loop
> > > - bpf_for_each_map_elem
> > > - bpf_user_ringbuf_drain
> > > - bpf_timer_set_callback (?)
>
> Hi Andrii, thank you for taking a look.
>
> > As Kumar mentioned, pretty much all helpers with callbacks are either
> > 0-1, or 0-1-many cases, so all of them (except for the async callback
> > ones that shouldn't be accepting parent stack pointers) should be
> > validated.
>
> Yes, makes sense. I need to finish the fix for bpf_loop first, as it
> seems like a good representative for many possible issues.

yes, it's the most generic 0-1-many case

> > > The sketch of the fix looks as follows:
> > > - extend struct bpf_func_state with callback iterator state information:
> > >   - active/non-active flag
> >
> > not sure why you need this flag
>
> Note:
> I have a better version of the fix locally but will share it a bit later.
> It actually depends on the states_equal() discussion in the sibling thread.
>
> In my updated version I use non-zero depth as the indicator,
> so there is no separate flag in the end.

great

> > >   - depth
> >
> > yep, this seems necessary
> >
> > >   - r1-r5 state at the entry to callback;
> >
> > not sure why you need this, this should be part of func_state already?
>
> This was a bit tricky but I think I figured out an acceptable solution w/o
> extra copies for r1-r5. The tricky part is the structure of
> check_helper_call():
> - collect arguments 'meta' info & check arguments
> - call __check_func_call():
>   - setup frame for callback;
>   - schedule the next instruction index to be the callback entry;
> - reset r1-r5 in caller's frame;
> - set r0 in caller's frame.
>
> The problem is that check_helper_call() resets caller's r1-r5
> immediately. I figured that this reset could be done at BPF_EXIT
> processing for the callback instead => no extra copy needed.

I guess then r0 setting would have to happen at BPF_EXIT as well,
right? Is that a problem?

> > > - extend __check_func_call() to set up callback iterator state when
> > >   a call to a suitable helper function is processed;
> >
> > this logic is "like iterator", but it's not exactly the same, so I
> > don't think we should reuse bpf_iter state (as you can see above with
> > active/non-active flag, for example)
>
> Yes, I agree, already removed this locally.

cool

> > > - extend BPF_EXIT processing (prepare_func_exit()) to queue a new
> > >   callback visit upon return from an iterating callback
> > >   (similar to process_iter_next_call());
> >
> > as mentioned above, we should simulate the "no callback called" situation
> > as well, don't forget about that
>
> Yeap

> > > - extend is_state_visited() to account for callback iterator hits in a
> > >   way similar to regular iterators;
> > > - extend backtrack_insn() to correctly react to jumps from callback
> > >   exit to callback entry.
> > >
> > > I have a patch (at the end of this email) that correctly recognizes
> > > the bpf_loop example in this thread as unsafe.
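
btw, just to make sure we are talking about the same class of problems:
the pattern that has to be caught is conceptually like below (a made-up
minimal example in the spirit of this thread, not the exact program from
it, modulo the usual selftests includes/boilerplate):

  static int cb(__u32 idx, __u64 *data)
  {
          (*data)++; /* parent stack slot changes on every iteration */
          return 0;  /* 0 means "continue looping" for bpf_loop */
  }

  SEC("?raw_tp")
  int oob_prog(void *unused)
  {
          __u8 choice_arr[2] = {};
          __u64 i = 0;

          bpf_loop(2, cb, &i, 0); /* i == 2 after both iterations ran */
          /* a verifier that simulates the callback at most once
           * concludes i <= 1 and misses this out-of-bounds access */
          return choice_arr[i];
  }

any parent stack slot that changes monotonically across iterations can
be used to construct a variation of this.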
> > > However, this patch has a few deficiencies:
> > >
> > > - verif_scale_strobemeta_bpf_loop selftest is not passing, because
> > >   the read_map_var function invoked from the callback continuously
> > >   increments the 'payload' pointer used in subsequent iterations.
> > >
> > >   In order to handle such code I need to add upper-bound tracking
> > >   for iteration depth (when such a bound could be deduced).
> >
> > if the example is broken and really can get out of bounds, we should
> > fix the example to be provably within bounds no matter how many
> > iterations it was called with
>
> For that particular case the number of iterations guarantees that the
> payload pointer will not get out of bounds. It is bumped up by 4 bytes
> on each iteration, but the number of iterations and the buffer size
> correlate to avoid out-of-bounds access.

ok, good, I was trying to avoid deducing bounds on the number of
iterations or anything like that.

> > > - loop detection is broken for a simple callback like the one below:
> > >
> > >   static int loop_callback_infinite(__u32 idx, __u64 *data)
> > >   {
> > >           for (;;)
> > >                   (*ctx)++;
> > >           return 0;
> > >   }
> > >
> > >   To handle such code I need to change is_state_visited() to do
> > >   callback iterator loop/hit detection on exit from the callback
> > >   (returns are not prune points at the moment); currently it is done
> > >   on entry.
> >
> > I'm a bit confused. What's ctx in the above example? And why doesn't
> > loop detection detect the for(;;) loop right now?
>
> It's an implementation detail of the fix sketch shared in the parent
> email. It can catch cases like this:
>
>   ... any insn ...;
>   for (;;)
>           (*ctx)++;
>   return 0;
>
> But it cannot catch a case like this:
>
>   for (;;)
>           (*ctx)++;
>   return 0;
>
> In that sketch I jump to the callback start from the callback return,
> and the callback entry needs two checks:
> - iteration convergence;
> - simple looping.
> Because of the code structure, only the iteration-convergence check was
> done. Locally, I fixed this issue by jumping to the callback-call
> instruction instead.

wouldn't this be a problem for just about any subprog if we don't check
the looping condition on the entry instruction? Perhaps that's a
separate issue that needs a generic fix?

> > > - the callback below currently causes state explosion:
> > >
> > >   static int precise1_callback(__u32 idx, struct precise1_ctx *ctx)
> > >   {
> > >           if (idx == 0)
> > >                   ctx->a = 1;
> > >           if (idx == 1 && ctx->a == 1)
> > >                   ctx->b = 1;
> > >           return 0;
> > >   }
> >
> > why state explosion? there should be a bunch of different branches
> > (idx 0, 1, or something else x ctx->a being 1 or not x ctx->b being
> > 1 or not), but it should be a fixed number of states? Do you know what
> > causes the explosion?
>
> I forgot to do mark_force_checkpoint() at callback entry. Fixed locally.

ok, makes sense

> > >
> > > I'm not sure yet what to do about this, there are several possibilities:
> > > - tweak the order in which states are visited (need to think about it);
> > > - check states in bpf_verifier_env::head (not explored yet) for
> > >   equivalence and avoid enqueuing duplicate states.
> > >
> > > I'll proceed with addressing the issues above on Monday.
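
btw, re: the overall mechanics, here is roughly how I picture the
BPF_EXIT side of this working (a sketch only; callback_exit,
callback_depth, callback_entry_insn and MAX_CALLBACK_ITER_DEPTH are
made-up names, the real code will certainly differ):

  /* called from do_check() when BPF_EXIT of a callback frame is seen */
  static int callback_exit(struct bpf_verifier_env *env)
  {
          struct bpf_func_state *callee = cur_func(env);

          if (callee->callback_depth < MAX_CALLBACK_ITER_DEPTH) {
                  /* queue one more simulated iteration of the callback
                   * body, similar to what process_iter_next_call() does
                   * for open-coded iterators; is_state_visited() should
                   * then prune once the looping states converge */
                  callee->callback_depth++;
                  env->insn_idx = callee->callback_entry_insn;
                  return 0;
          }
          /* done iterating - normal return to the caller; this is also
           * where the r1-r5 reset and r0 setup discussed above would
           * now happen */
          return prepare_func_exit(env, &env->insn_idx);
  }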
> > >
> > > Thanks,
> > > Eduard
> > >
> > > ---
> > >
> > > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > > index a3236651ec64..5589f55e42ba 100644
> > > --- a/include/linux/bpf_verifier.h
> > > +++ b/include/linux/bpf_verifier.h
> > > @@ -70,6 +70,17 @@ enum bpf_iter_state {
> > >  	BPF_ITER_STATE_DRAINED,
> > >  };
> > >
> > > +struct bpf_iter {
> > > +	/* BTF container and BTF type ID describing
> > > +	 * struct bpf_iter_<type> of an iterator state
> > > +	 */
> > > +	struct btf *btf;
> > > +	u32 btf_id;
> > > +	/* packing following two fields to fit iter state into 16 bytes */
> > > +	enum bpf_iter_state state:2;
> > > +	int depth:30;
> > > +};
> > > +
> > >  struct bpf_reg_state {
> > >  	/* Ordering of fields matters. See states_equal() */
> > >  	enum bpf_reg_type type;
> > > @@ -115,16 +126,7 @@ struct bpf_reg_state {
> > >  		} dynptr;
> > >
> > >  		/* For bpf_iter stack slots */
> > > -		struct {
> > > -			/* BTF container and BTF type ID describing
> > > -			 * struct bpf_iter_<type> of an iterator state
> > > -			 */
> > > -			struct btf *btf;
> > > -			u32 btf_id;
> > > -			/* packing following two fields to fit iter state into 16 bytes */
> > > -			enum bpf_iter_state state:2;
> > > -			int depth:30;
> > > -		} iter;
> > > +		struct bpf_iter iter;
> >
> > Let's not do this; conceptually the processes are similar, but bpf_iter
> > is one thing, and this callback validation is another thing. Let's not
> > conflate things.
> >
> > >
> > >  		/* Max size from any of the above. */
> > >  		struct {
> >
> > [...]
>
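
instead of reusing bpf_iter, a tiny bit of state directly in struct
bpf_func_state should be enough and would keep the two mechanisms
independent. Something like this (just a sketch; the field name is made
up and the naming is up to you):

  struct bpf_func_state {
          /* ... existing fields ... */

          /* non-zero only while a callback body is being verified
           * iteratively; counts simulated iterations, so, as you said,
           * it doubles as the active/non-active flag */
          u32 callback_depth;
  };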