On Wed, Sep 27, 2023 at 6:09 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote: > > On Tue, 2023-09-26 at 09:25 -0700, Andrii Nakryiko wrote: > [...] > > > In other words there is a function states_equal' for comparison of > > > states when old{.branches > 0}, which differs from states_equal in > > > the following way: > > > - considers all registers read; > > > - considers all scalars precise. > > > > > > > Not really. The important aspect is to mark registers that were > > required to be imprecise in old state as "required to be imprecise", > > and if later we decide that this register has to be precise, too bad, > > too late, game over (which is why I didn't propose it, this seems too > > restrictive). > > Could you please elaborate a bit? What's wrong with the following: > Suppose I see a register R that differs between V and C an is not > precise in both. I fork C as C', mark R unbound in C' and proceed with > C' verification. At some point during that verification I see that > some precise R's value is necessary, thus C' verification fails. > If that happens verification resumes from C, otherwise C is discarded. > I also postpone read and precision marks propagation from C' to it's > parent until C' verification succeeds (if it succeeds). Nothing wrong, I'm just saying your C' derivation gives a chance to undo things, while my eager idea wouldn't allow to do that. And that's bad, so I didn't propose any of that. Aside from that, I'd be curious to see how to prevent read/precision marks propagation to V, seems like yet another hack and special case, which just seems like yet another complication. > > [...] > > 1. If V and C (using your terminology from earlier, where V is the old > > parent state at some next() call instruction, and C is the current one > > on the same instruction) are different -- we just keep going. So > > always try to explore different input states for the loop. > > > > 2. But if V and C are equivalent, it's too early to conclude anything. > > So enqueue C for later in a separate BFS queue (and perhaps that queue > > is per-instruction, actually; or maybe even per-state, not sure), and > > keep exploring all the other pending queues from the (global) DFS > > queue, until we get back to state V again. At that point we need to > > start looking at postponed states for that V state. But this time we > > should be sure that precision and read marks are propagated from all > > those terminatable code paths. > > > > Basically, this tries to make sure that we do mark every register that > > is important for all the branching decision making, memory > > dereferences, etc. And just avoids going into endless loops with the > > same input conditions. > > > > Give it some fresh thought and let's see if we are missing something > > again. Thanks! > > This should work for examples we've seen so far. > Why do you think a separate per-instruction queue is necessary? I'm not positive, it was just a possibility that it might matter. I'd try with global queue first and tried to break the approach. > The way I read it the following algorithm should suffice: > - add a field bpf_verifier_env::iter_head similar to 'head' but for > postponed looping states; > - add functions push_iter_stack(), pop_iter_stack() similar to > push_stack() and pop_stack(); I don't like the suggested naming, it's too iter-centric, and it's actually a queue, not a stack, etc. But that's something to figure out later. > - modify is_state_visited() as follows: > > static int is_state_visited(struct bpf_verifier_env *env, int insn_idx) > { > ... > while (sl) { > ... > if (sl->state.branches) { > ... > if (is_iter_next_insn(env, insn_idx)) { > if (states_equal(env, &sl->state, cur)) { > ... > iter_state = &func(env, iter_reg)->stack[spi].spilled_ptr; > if (iter_state->iter.state == BPF_ITER_STATE_ACTIVE) { > + // Don't want to proceed with 'cur' verification, > + // push it to iters queue to check again if states > + // are still equal after env->head is exahusted. > + if (env->stack_size != 0) > + push_iter_stack(env, cur, ...); > goto hit; > } > } > goto skip_inf_loop_check; > } > ... > } > > - modify do_check() to do pop_iter_stack() if pop_stack() is > exhausted, the popped state would get into is_state_visited() and > checked against old state, which at that moment should have all > read/precision masks that env->head could have provided. Yeah, something like that. > > After working on "widening conjectures" implementation a bit this > approach seems to be much simpler. Need to think harder if I can break it. yeah, I think this C' derivation while conceptually seems straightforward, probably has tons of non-obvious complications.