On Wed, May 29, 2024 at 3:14 AM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
>
> On Tue, 2024-05-28 at 20:22 -0700, Alexei Starovoitov wrote:
>
> [...]
>
> > > However, below is an example where if comparison is BPF_X.
> > > Note that I obfuscated constant 5 as a volatile variable.
> > > And here is what happens when verifier rejects the program:
> >
> > Sounds pretty much like: doctor it hurts when I do that.
>
> Well, the point is not in the volatile variable but in the BPF_X
> comparison instruction. The bound might be the size of some buffer,
> e.g. encoded like this:
>
>   struct foo {
>       int *items;
>       int max_items; // suppose this is 5 for some verification path
>   };                 // and 7 for another.
>
> And you don't need bpf_for specifically, an outer loop with
> can_loop should also lead to get_loop_entry(...) being non-NULL.

Right. Open coded iters and can_loop have the same convergence property.
They both need loop iterations to look like an infinite loop (same states),
while bounded loop logic needs the opposite: it needs loop states to be
different, otherwise it's an infinite loop and the prog is rejected.
So it's simply impossible for the verifier to excel at both.

The heuristic I'm adding is trying to make it work well for open coded
iters, focusing on converging when the loop count is high and impractical
for a bounded loop. At the same time the heuristic is trying not to degrade
a corner case where bounded loop logic might help (by recognizing
const_dst ?= const_src and not widening the equality operation).
But the heuristic is not going to be perfect. As I said, it's two steps
forward and a minimal step back.

I'm adding the following selftest:

  volatile const int limit = 100000; /* global */

  while ((v = bpf_iter_num_next(&it))) {
          if (i < limit)
                  sum += arr[i++];
  }

influenced by your test, and it passes due to the widening logic.
Such a test is an impossible task for a bounded loop. Hence this step
forward for open coded iters is much bigger than the corner case loss of
bounded loop logic inside an open coded iter.

> > > +	volatile unsigned long five = 5;
> > > +	unsigned long sum = 0, i = 0;
> > > +	struct bpf_iter_num it;
> > > +	int *v;
> > > +
> > > +	bpf_iter_num_new(&it, 0, 10);
> > > +	while ((v = bpf_iter_num_next(&it))) {
> > > +		if (i < five)
> > > +			sum += arr[i++];
> >
> > If you're saying that the verifier should accept that
> > no matter what then I have to disagree.
> > Not interested in avoiding issues in programs that
> > are actively looking to explore a verifier implementation detail.
>
> I don't think that this is a very exotic pattern,
> such code could be written if one has a buffer with a dynamic bound
> and seeks to fill it with items from some collection applying filtering.

After thinking about it more I had to agree, hence I added it as a test.
volatile on the stack is exotic and impractical, but 'volatile const int limit;'
as a global var is a very real use case.

My widening logic was indeed too broad for the BPF_X case:
reg_set_min_max() will see two unknown scalars and there will be nothing to
learn bounds from. I'm fixing it with a double call to reg_set_min_max():
once for src and other_src, and a second time for dst and other_dst.
This way "i < five" is properly widened for both registers.

> I do not insist that the verifier should accept such programs,
> but since we are going for heuristics to do the widening,
> I think we should try and figure out a few examples where
> the heuristic breaks, just to understand if that is ok.
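For reference, here is a concrete sketch of the "buffer with a dynamic
bound" pattern mentioned above. This is only a sketch: the struct, field,
array and section names are made up, it is not one of the actual selftests,
and whether the verifier accepts it is exactly what the widening heuristic
under discussion decides.

  #include <linux/bpf.h>
  #include <bpf/bpf_helpers.h>

  char _license[] SEC("license") = "GPL";

  /* open coded iterator kfuncs, normally pulled in via selftest headers */
  extern int bpf_iter_num_new(struct bpf_iter_num *it, int start, int end) __ksym;
  extern int *bpf_iter_num_next(struct bpf_iter_num *it) __ksym;
  extern void bpf_iter_num_destroy(struct bpf_iter_num *it) __ksym;

  struct foo {
          int max_items;  /* dynamic bound: e.g. 5 for one config, 7 for another */
          int items[8];   /* buffer to fill */
  };

  struct foo f;   /* filled in by user space before the prog runs */
  int src[16];    /* some collection to filter items from */

  SEC("socket")
  int fill_with_dynamic_bound(void *ctx)
  {
          struct bpf_iter_num it;
          int out = 0;
          int *v;

          bpf_iter_num_new(&it, 0, 16);
          while ((v = bpf_iter_num_next(&it))) {
                  int idx = *v & 0xf;     /* keep the index provably in bounds */

                  if (src[idx] <= 0)      /* the filtering step */
                          continue;
                  /* BPF_X compare against a bound the verifier cannot know,
                   * same shape as 'i < five' in the quoted test.
                   */
                  if (out < f.max_items && out < 8)
                          f.items[out++] = src[idx];
          }
          bpf_iter_num_destroy(&it);
          return out;
  }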
btw the existing widen_imprecise_scalars() heuristic for open coded iters
is more or less the same thing. It may hurt bounded loop logic too.
A bit of hand waving:

  while ((v = bpf_iter_num_next(&it))) {
          if (i < 100) {
                  j++;
                  i++;
                  // use 'j' such that it's safe only when 'j' is precise.
          }
  }

(see the P.S. below for a fleshed out version)

Without this patch 'i < 100' will force precision on 'i', but
widen_imprecise_scalars() might widen 'j'. In the end, users would need to
understand how the verifier widens 'i < 100' in order to write progs with
open coded iters and normal maps. Or they will switch to may_goto and arena
and things will "just work".

In arena progs we currently do a ton of useless mark_chain_precision work
because of 'i < 100' conditions. When all the other pointers are arena
pointers they don't trigger precision marking; only 'i < 100' does.
So after this patch the verifier is doing a lot less work for arena
programs. That's another reason why I want to remove precision marking
after is_branch_taken.

Anyway, v4 will be out soon.
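P.S. To make the hand waving above a bit more concrete, here is one way 'j'
could be used so that the program is safe only while the verifier still
tracks j's bounds. A sketch only: the array, its size, the section name and
the scaffolding are made up.

  #include <linux/bpf.h>
  #include <bpf/bpf_helpers.h>

  char _license[] SEC("license") = "GPL";

  extern int bpf_iter_num_new(struct bpf_iter_num *it, int start, int end) __ksym;
  extern int *bpf_iter_num_next(struct bpf_iter_num *it) __ksym;
  extern void bpf_iter_num_destroy(struct bpf_iter_num *it) __ksym;

  int buf[128];

  SEC("socket")
  int widen_vs_precise_j(void *ctx)
  {
          struct bpf_iter_num it;
          int i = 0, j = 0;
          int *v;

          bpf_iter_num_new(&it, 0, 10);
          while ((v = bpf_iter_num_next(&it))) {
                  if (i < 100) {
                          j++;
                          i++;
                          /* 'j' is used as an array index: the access is
                           * provably in bounds only while the verifier keeps
                           * track of j's range, so widening 'j' to an unknown
                           * scalar here would make the program unverifiable.
                           */
                          buf[j] = i;
                  }
          }
          bpf_iter_num_destroy(&it);
          return 0;
  }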