On Fri, 2023-09-22 at 13:52 -0700, Andrii Nakryiko wrote: > > > I can't say I understand what exactly you are proposing and how you > > > are going to determine these conservative precision marks. But I'd > > > like to learn some new ideas, so please elaborate :) > > > > I'll send a follow-up email, trying to figure out what to do with pointers. > > Ok, sounds good. (below explains why DFA was a bad idea, please proceed to the next section if that's not interesting). My initial thought was that it would be possible to use a simple live-variable DFA in order to compute conservative read and precise marks. Like a textbook version: - split code in basic blocks; - for each instruction propagate liveness information backwards: - define USE(insn) to be a set of registers / stack slots read by this instruction; - define KILL(insn) to be a set of registers / stack slots written to by this instruction. - compute set of live variables before insn as: LIVE_in(insn) = USE(insn) U (LIVE_out(insn) / KILL(insn)) - compute basic block BB_out as a union of BB_in of each successor; - etc... iterate in postorder until fixed point is reached. However, this is a moot point because of presence of pointers. For series of instructions like: "r1 = r10; r1 += -8; r1 = *(u64 *)(r1 + 0);" Such algorithm would need to know possible range of values for r1. Meaning that before executing liveness DFA some sort of value range analysis would have to be run. And these things might get complicated [1] (I never implemented one, maybe it's not that bad). So, basically that would be a mini-copy of the verifier but implemented as a thing that computes fixed point instead of "tracing". Probably a no-go. [1] https://boxbase.org/entries/2017/oct/23/range_analysis_papers/ --- However, I think there are two more relatively simple options that should give correct results. Option A. "Exit or Loop" ------------------------ - Similar to Andrii's suggestion postpone exploration of bpf_iter_*_next() #N->#N+1 states for as long as there are other states to explore. - When evaluating is_state_visited() for bpf_iter_*_next() #N->#N+1 allow to use visited state with .branches > 0 to prune current state if: - states_equal() for visited and current states returns true; - all verfication paths starting from current instruction index with current iterator ref_obj_id either run to safe exit or run back to current instruction index with current iterator ref_obj_id. The last rule is needed to: - Ensure presence of read and precision marks in a state used for pruning. - Give a meaning to the precise scalars comparisons in regsafe() which I think is currently missing for bpf_iter_*_next() .branches > 0 case. Precise scalars comparison for visited states with .branches == 0 could be read as: "using any value from a specific range it is possible to reach safe exit". Precise scalars comparison for looping states from above could be read as: "using any value from a specific range it is possible to reach either safe exit or get back to this instruction". This algorithm should be able to handle simple iteration patterns like: bpf_iter_*_new(&it1, ...) while (!bpf_iter_*_next(&it1)) { ... } bpf_iter_*_destroy(&it1) And also handle nested iteration: bpf_iter_*_new(&it1, ...) while (!bpf_iter_*_next(&it1)) { bpf_iter_*_new(&it2, ...) while (!bpf_iter_*_next(&it2)) { ... } bpf_iter_*_destroy(&it2) } bpf_iter_*_destroy(&it1) But it would fail to handle more complex patterns, e.g. interleaved iteration: for (;;) { if (!bpf_iter_*_next(&it1)) // a break; if (!bpf_iter_*_next(&it2)) // b break; ... } For any state originating in (a) there would always be a state in (b) pending verification and vice versa. It would actually bail out even if it1 == it2. Option B. "Widening" -------------------- The trivial fix for current .branches > 0 states comparison is to force "exact" states comparisons for is_iter_next_insn() case: 1. Ignore liveness marks, as those are not finalized yet; 2. Ignore precision marks, as those are not finalized yet; 3. Use regs_exact() for scalars comparison. This, however, is very restrictive as it fails to verify trivial programs like (iters_looping/simplest_loop): sum = 0; bpf_iter_num_new(&it, 0, 10); while (!(r0 = bpf_iter_num_next())) sum += *(u32 *)r0; bpf_iter_num_destroy(&it); Here ever increasing bounds for "sum" prevent regs_exact() from ever returning true. One way to lift this limitation is to borrow something like the "widening" trick from the abstract interpretation papers mentioned earlier: - suppose there is current state C and a visited is_iter_next_insn() state V with .branches > 0; - suppose states_equal(V, C) returns true but exact states comparison returns false because there are scalar values not marked as precise that differ between V and C. - copy C as C' and mark these scalars as __mark_reg_unbounded() in C'; - run verification for C': - if verification succeeds -- problem solved; - if verification fails -- discard C' and proceed from C. Such algorithm should succeed for programs that don't use widened values in "precise" context. Looking at testcases failing with trivial fix I think that such limitations would be similar to those already present (e.g. the big comment in progs/iters.c:iter_obfuscate_counter would still apply). --- Option B appears to be simpler to implement so I'm proceeding with it. Will share progress on Monday. Please let me know if there are any obvious flaws with what I shared above. Thanks, Eduard