On Tue, 2023-10-03 at 19:57 -0700, Alexei Starovoitov wrote: > > Yes. This may be an issue. I'll try to hack a layered variant I talked > > before to see what are the underlying issues. The idea is to verify > > every child state of the loop entry before moving to any branches of > > loop's parent states. > > Isn't that what current DFS logic does? With a twist: it should wait for local loop convergence before declaring loop entry state safe. I described it in [1]. Upon entry to a top level loop state E: 1. do DFS for E, disallow switching to any branches that are not children of E, collect all loop states generated by E's DFS; 3. reschedule loop states; 4. do DFS for rescheduled states, collect all loop states generated in the process; 5. repeat from 3 until no loop state could be rescheduled; 6. mark E safe, proceed from regular states stack. [1] https://lore.kernel.org/bpf/8b75e01d27696cd6661890e49bdc06b1e96092c7.camel@xxxxxxxxx/ > > The idea is that when `env->stack.size == 0` all V's non-delayed > > children states are explored and relevant precision marks are > > propagated to V. > > But that is not true. > if (is_iter_next_insn(env, insn_idx)) { > if (states_equal(env, &sl->state, cur)) { > with sl->state.branches > 0 > will prevent exploration of all children. Yes, it would postpone exploration of some children, but it would be given a chance to reschedule. > Hence I still believe that fine tunning this equavalence check is > the first step in any fix. Maybe, but so far states_equal(V, C) seems ok: 1. values read by V are defined in C; 2. non-precise scalars are equivalent unconditionally; 3. precise scalars in C are sub-ranges of precise scalars in V; 4. non-scalar values are compared exactly (modulo idmap). Which of these points are overly optimistic in your opinion? > > If at this point some states_equal(V, C') is false it > > is necessary to schedule C' for one more exploration round as it might > > visit some code blocks that were not visited on the path from V to C' > > (as different predictions decisions could be made). > > exactly my point above, > but because of broken double 'if' above the 2nd pass might be hitting > the same issues as the first. Because some path wasn't explored > the precision marks might still be wrong. > > More loop states can be created and delayed again into loop_stack? Yes, and new reschedule would be done and so on. > > If a point is reached when for all loop states C' there are some > > states V such that states_equal(V, C'), there are no more > > configurations that can visit code blocks not visited on the path from > > V to C', as prediction decisions would be the same. > > and it not, we can repeat loop_stack->stack->loop_stack potentially forever? It might be the case that such logic would not converge, e.g. as in the following simple example: it = iter_new(); i = 0; while (iter_next(it)) { if (i & 0x1 == 0) printk("'i' is precise and different in each state"); ++i; } iter_destroy(it); However: - we have such limitation already; - examples like this don't play well with current states pruning logic in general (I think that doing some opportunistic widening of precise values is an interesting idea (loosing or worsening precision with ability to backtrack on failure), but this is a different topic). If loop logic would not converge verifier would report an error: either instruction or jump limit would be reached. For the cases when this logic does converge we end up in a state when no new read or precision marks could be gained, thus it is safe to accept states_equal == true verdicts. > Did you look at my broken patch? yes. it's buggy, but it demonstrates > the idea where it removes above problematic double 'if' and uses > states_equal() on states where st->branches==0 && st->looping_states==1 <replied in a different email> > meaning that in those states all paths were explored except the one > forked by iter_next kfunc. > So precision and liveness is correct, That's probably true. > while doing states_equal on V with branches=2+ is broken. Not that it is broken, but it's meaning is "maybe a sub-state" instead of "definitely a sub-state".