On Thu, Sep 21, 2023 at 6:01 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote: > > On Thu, 2023-09-21 at 21:16 +0300, Eduard Zingerman wrote: > > On Thu, 2023-09-21 at 09:35 -0700, Andrii Nakryiko wrote: > > > I've been thinking in a similar direction as Alexei, overnight. Here > > > are some raw thoughts. > > > > > > I think the overall approach with iter verification is sound. If we > > > loop and see an equivalent state at the entry to next loop iteration, > > > then it's safe to assume doing many iterations is safe. The problem is > > > that delayed precision and read marks make this state equivalence > > > wrong in some case. So we need to find a solution that will ensure > > > that all precision and read marks are propagated to parent state > > > before making a decision about convergence. > > > > > > The idea is to let verifier explore all code paths starting from > > > iteration #N, except the code paths that lead to looping into > > > iteration #N+1. I tried to do that with validating NULL case first and > > > exiting from loop on each iteration (first), but clearly that doesn't > > > capture all the cases, as Eduard have shown. > > > > > > So what if we delay convergence state checks (and then further > > > exploration at iteration #N+1) using BFS instead of DFS? That is, when > > > we are about to start new iteration and check state convergence, we > > > enqueue current state to be processed later after all the states that > > > "belong" to iteration #N are processed. > > > > This sounds correct if one iterator is involved. > > > > > We should work out exact details on how to do this hybrid BFS+DFS, but > > > let's think carefully if this will solve the problems? > > > > > > I think this is conceptually similar to what Alexei proposed above. > > > Basically, we "unroll" loop verification iteration by iteration, but > > > make sure that we explore all the branching within iteration #N before > > > going one iteration deeper. > > > > > > Let's think if there are any cases which wouldn't be handled. And then > > > think how to implement this elegantly (e.g., some sort of queue within > > > a parent state, which sounds similar to this separate "branches" > > > counter that Alexei is proposing above). > > > > To better understand the suggestion, suppose there is some iterator > > 'I' and two states: > > - state S1 where depth(I) == N and pending instruction is "next(I)" > > - state S2 where depth(I) == N and pending instruction is *not* "next(I)" > > In such situation state S2 should be verified first, right? > > And in general, any state that is not at "next(I)" should be explored > > before S1, right? > > > > Such interpretation seems to be prone to deadlocks, e.g. suppose there > > are two iterators: I1 and I2, and two states: > > - state S1 with depth(I1) == N, depth(I2) == M, at instruction "next(I1)"; > > - state S2 with depth(I1) == N, depth(I2) == M, at instruction "next(I2)". > > > > E.g. like in the following loop: > > > > for (;;) { > > if (<random condition>) > > if (!next(it1)) break; // a > > else > > if (!next(it2)) break; // b > > ... > > } > > > > I think it is possible to get to two states here: > > - (a) it1(active, depth 0), it2(active, depth 0) at insn "next(it1)" > > - (b) it1(active, depth 0), it2(active, depth 0) at insn "next(it2)" > > > > And it is unclear which one should be processed next. > > Am I missing something? > > Not sure if such ordering issues a real issues or any of the > candidates could be picked. > Yes, my gut feeling was that if this idea works at all, then ordering for this won't matter. The question is if the idea itself is sound. Basically, I need to convince myself that subsequent iterations won't add any new read/precise marks. You are good at counter examples, so maybe you can come up with an example where input state into iteration #1 will get more precision marks only at iteration #2 or deeper. If that's the case, then this whole idea of postponing loop convergence checks probably doesn't work. > How about a more dumb solution which, imho, is easier to convince > oneself is correct (at-least from logical pov, not necessarily > implementation): substitute the goal of finding exact precision marks > with the goal of finding conservative precision marks. These marks > could be used instead of exact for states_equal() when .branches > 0. > > A straightforward way to compute such marks is to run a flow-sensitive > context/path-insensitive backwards DFA before do_check(). The result > of this DFA is a bitmask encoding which registers / stack slots may be > precise at each instruction. Information for instructions other than > is_iter_next_insn() could be discarded. > > Looking at the tests that are currently failing with my local fixes > (which force exact states comparions for .branches > 0) I think that > DFA based version would cover all of them. > > This sure adds some code to the verifier, however changing current > states traversal logic from DFS to combination of DFS+BFS is a > significant change as well. I don't think adding BFS sequencing is a lot of code. It's going to be a simple and small amount of code with potentially intricate behavior. But code-wise should be pretty minimal. > > Wdyt? I can't say I understand what exactly you are proposing and how you are going to determine these conservative precision marks. But I'd like to learn some new ideas, so please elaborate :)