On Wed, Oct 04, 2023 at 12:52:07AM +0300, Eduard Zingerman wrote: > On Tue, 2023-10-03 at 11:50 -0700, Alexei Starovoitov wrote: > > On Tue, Oct 3, 2023 at 8:33 AM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote: > > > > > > When I put states to the loop stack I do copy_verifier_state() and > > > increase .branches counter for each state parent, so this should not > > > trigger warnings with update_branch_counts(). > > > > No warn doesn't mean it's correct. > > I suspect what your reschedule_loop_states() will increase > > branches from 0 to 1. > > Which is equivalent to increasing refcnt from zero. Not good. > > Not really, here is how I modified bpf_verifier_env: > > struct bpf_verifier_state_stack { > struct bpf_verifier_stack_elem *head; /* stack of verifier states to be processed */ > int size; /* number of states to be processed */ > }; > > struct bpf_verifier_env { > ... > struct bpf_verifier_state_stack stack; > struct bpf_verifier_state_stack loop_stack; > ... > } > > Here env->stack is used for DFS traversal and env->loop_stack is used > to delay verification of the loop states. > > When bpf_iter_next() is reached in state C and states_equal() shows > that there is potentially equivalent state V: > - copy C' of C is created sing copy_verifier_state(), it updates all > branch counters up the ownership chain as with any other state; > - C' is put to env->loop_stack. and C' points to normal parent and increments its branches as part of __push_stack(). By delaying all looping states that looks to be equal to V states due to broken precision marks the verifier will walk almost everything after the loop, converge all branches to 0, then clean_live_states() and REG_LIVE_DONE in the bottom part, then it will proceed to verify all branches before the loop, but since their branch counter is stuck due to states in loop_stack, the upper part of the program won't get the same treatment. update_branch_counts() will never go past the point where C' was put on loop_stack, so every state before the loop will have branches > 0 and verification of branches before the loop will not do any state pruning at all. void foo() { if (...) { .. } // no pruning if (...) { .. } // no pruning bpf_for(...) if (...) { .. } // all ok } Essentially any complex code before the loop has a high chance of the verifier state explosion. > It is possible and correct to propagate liveness and precision from V > to C when loop steady state is achieved, as at that point we know for > sure that C is a sub-state of V. However, currently loop states are > tracked globally and no states are processed after loops steady state > is reached, hence I don't do liveness and precision propagation. Hmm. I think the code is doing it: if (is_iter_next_insn(env, insn_idx)) { if (states_equal(env, &sl->state, cur)) { push_loop_stack(env, insn_idx, env->prev_insn_idx); goto hit; hit: propagate_liveness() > The necessity to visit all grandchildren states of V leads to > necessity of mixed DFS and BFS traversal of the loop states. DFS + BFS traversal doesn't guarantee safety. We delayed looping states in loop_stack, but it doesn't mean that the loop body reached a steady state. Your own num_iter_bug() is an example. The verifier will miss exploring path with r7 = 0xdead. When C' is popped from loop_stack there is chance it will explore them, but there is no guarantee. For this particular num_iter_bug() case the loop processing will eventually propagate precision marks and retrying with C' later, the C' will be correctly rejected, since precision marks are there. But I think that is more due to luck and the shape of the test. if (is_iter_next_insn(env, insn_idx)) { if (states_equal(env, &sl->state, cur)) { is a foot gun. It finds broken equivalence and potentially skipping whole blocks of code. What guarantees that C' from loop_stack will explore them? I think avoiding states_equal(V, cur) when V state didn't converge is a mandatory part of the fix. I can see that little bit of out-of-order state processing probably is still correct, but delaying loop_stack all the way until env->stack.size == 0 is just broken. For example we can push looping state in process_iter_next_call() not the top of the stack, but place it after branches with insn_idx >= iter_next's insn_idx. This way the delayed looping states will still be processed around loop processing and every code block before the loop will converge to branches = 0. To do that we don't need another loop_stack. I think we need a call to converge. Office hours on Thu at 9am? Or tomorrow at 9am?