On Thu, 2023-09-28 at 11:30 -0700, Andrii Nakryiko wrote: [...] > > The way I read it the following algorithm should suffice: > > - add a field bpf_verifier_env::iter_head similar to 'head' but for > > postponed looping states; > > - add functions push_iter_stack(), pop_iter_stack() similar to > > push_stack() and pop_stack(); > > I don't like the suggested naming, it's too iter-centric, and it's > actually a queue, not a stack, etc. But that's something to figure out > later. After spending some time I think I figured out an example that shows that postponed looping states do not form a queue. Please bear with me, as it is quite large: 1. j = iter_new(); // fp[-16] 2. a = 0; // r6 3. b = 0; // r7 4. c = -24; // r8 5. while (iter_next(j)) { 6. i = iter_new(); 7. a = 0; 8. b = 0; 9. while (iter_next(i)) { 10. if (a == 1) { 11. a = 0; 12. b = 1; 13. } else if (a == 0) { 14. a = 1; 15. if (random() == 42) 16. continue; 17. if (b == 1) { 18. *(r10 + c) = 7; 19. iter_destroy(i); 20. iter_destroy(j); 21. return; 22. } 23. } 24. } 25. iter_destroy(i); 26. a = 0; 27. b = 0; 28. c = -25; 29. } 30. iter_destroy(j); 31. return; This example is unsafe, because on second iteration of loop 5-29 the value of 'c' is -25 (assignment at 28) and 'c' is used for stack access at 18, offset of -25 is misaligned. The basic idea is to: (a) hide the fact that (c) is precise behind a postponed state; (b) hide the unsafe (c) value behind a postponed state. The (b) is achieved on first iteration of the loop 5-29: enter 5 in state with c=-24 not precise, 'i' is initially considered drained thus verification path is 5-9,25-29,5. Upon second visit to 5 the state is postponed because c=-24 is considered equal to c=-25 (both are not precise). State at first visit of 5: R0=scalar() R1_rw=fp-16 R6=0 R7=0 R8=-24 R10=fp0 fp-16_r=iter_num(ref_id=1,state=active,depth=0) refs=1 State at second visit of 5 (postponed, let's call it P1): R1_w=fp-16 R6=0 R7=0 R8=-25 R10=fp0 fp-16=iter_num(ref_id=1,state=active,depth=1) refs=1 The (a) is achieved by a series of jumps in the inner loop 9-24: - first visit of 9 is in state {a=0,b=0,c=XX} (here XX is a value of C and it is either -24 or -25, depending on outer loop iteration number); - second visit of 9 is in state {a=1P,b=0,c=XX} (path 9,14-16,9; at this point 'a' is marked precise because of prediction at 13); - third visit of 9 is in state {a=0P,b=1,c=XX} (path 9-12,9; 'b' is not yet marked as precise and thus this state is equal to the first one and is postponed). - after this verifier visits 17, predicts condition as false and marks 'b' as precise, but {a=1P,b=1,c=XX} is already postponed. E.g. the following state is postponed at first iteration, let's call it P2: R0=... R1_w=fp-8 R6=0 R7=1 R8=-24 R10=fp0 fp-8=iter_num(ref_id=3,state=active,depth=2) fp-16=iter_num(ref_id=1,state=active,depth=1) refs=1,3 If verification were to proceed from state P2 {a=0P,b=1P,c=XX} at 9 the path would be 9,13-14,17-21 and 'c' would be marked as precise. But if we process postponed states as a queue P1 would be processed first, 'c' would not have a precision mark yet and P1 would be discarded => P2 with R8=-25 would never be investigated. I converted this test case to assembly [2] and tried with a prototype [1] to make sure that I don't miss anything. --- In general, states form a tree during verification and we want to stop the tree growth when we see a would be back-edge to a similar state. Consider the following state graph: .---> S1 <---. Here there is a chain of derived states: | | | S1 -> S2 -> S3 | v | | S2 -> S2' And there are potentially looping states: | | - S2' (derived from S2, states_equal to S1) | v - S3' (derived from S3, states_equal to S1) S3' <- S3 If we find that S2' is no longer states_equal to S1 it might add a derivation that would eventually propagate read or precision mark to some register in S1. Same is true for S3'. Thus there is no clear order in which S2' and S3' should be processed. It is possible to imagine more complex state graphs. Unfortunately, preparing real BPF examples for this graphs is very time consuming (at-least for me). Hence, I think that a fixed point computation is necessary as I write in a sibling email to Alexei: - Process states from env->head while those are available, in case if potential looping state (is_states_equal()) is reached put it to a separate queue. - Once all env->head states are processed the only source for new read and precision marks is in postponed looping states, some of which might not be is_states_equal() anymore. Submit each such state for verification until fixed point is reached (repeating steps for env->head processing). The work-in-progress implementation for this algorithm is in [1]. --- While working on this I found an unexpected interaction with infinite loop detection for the following test case: // any branch if (random() > 42) a = 0; else a = 1; // iterator loop i = iter_new() while (iter_next(i)) // do something When first branch of 'if' is processed a looping state for 'while' is added to a postponed states queue. Then second branch of 'if' is processed and reaches 'iter_new()', at which point infinite loop detection logic sounds an alarm: if (states_maybe_looping(&sl->state, cur) && states_equal(env, &sl->state, cur) && !iter_active_depths_differ(&sl->state, cur) && false) { verbose_linfo(env, insn_idx, "; "); verbose(env, "infinite loop detected at insn %d\n", insn_idx); return -EINVAL; } The state with iter_new() is a parent for a looping state thus has it's .branches counter as non-zero, and iterator depth is 0 for both hands of the 'if'. Adding such states to the looping states queue is not safe, as there is no guarantee this arbitrary state would terminate (opposed to iter_next() for which we are guaranteed to have 'drained' branch). Thus, the only fix that I see now is to disable infinite loop detection if there are active iterators. (Might do some trick with remembering this state are printing it if instructions limit counter is reached, but I'm not sure if added complexity is worth it, e.g. what to do if there are several such states?) --- [1] https://github.com/eddyz87/bpf/tree/iters-bug-delayed-traversal [2] https://github.com/eddyz87/bpf/blob/iters-bug-delayed-traversal/tools/testing/selftests/bpf/progs/iters.c#L860 Thanks, Eduard