On Thu, Jun 1, 2023 at 9:57 AM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
>
> On Wed, 2023-05-31 at 19:05 -0700, Alexei Starovoitov wrote:
> > [...]
> > > Suppose that current verification path is 1-7:
> > > - On the way down 1-6 r7 will not be marked as precise, because
> > >   condition (r7 > X) is not predictable (see check_cond_jmp_op());
> > > - When (7) is reached mark_chain_precision() will start moving up,
> > >   marking the following registers as precise:
> > >
> > >     4: if (r6 > r7) goto +1   ; r6, r7
> > >     5: r7 = r6                ; r6
> > >     6: if (r7 > X) goto ...   ; r6
> > >     7: r9 += r6               ; r6
> > >
> > > - Thus, if a checkpoint is created for (6), r7 would be marked as read,
> > >   but will not be marked as precise.
> > >
> > > Next, suppose that the jump from 4 to 6 is verified and the checkpoint
> > > for (6) is considered:
> > > - r6 is not precise, so check_ids() is not called for it and it is not
> > >   added to the idmap;
> > > - r7 is precise, so check_ids() is called for it, but it is a sole
> > >   register in the idmap;
> >
> > typos in above?
> > r6 is precise and r7 is not precise.
>
> Yes, it should be the other way around in the description:
> r6 precise, r7 not precise. Sorry for the confusion.
>
> > > - States are considered equal.
> > >
> > > Here is the log (I added a few prints for states cache comparison):
> > >
> > >   from 10 to 13: safe
> > >   steq hit 10, cur:
> > >   R0=scalar(id=2) R6=scalar(id=2) R7=scalar(id=1) R9=fp-8 R10=fp0 fp-8=00000000
> > >   steq hit 10, old:
> > >   R6_rD=Pscalar(id=2) R7_rwD=scalar(id=2) R9_rD=fp-8 R10=fp0 fp-8_rD=00000000
> >
> > The log is correct, though.
> > r6_old = Pscalar, which will go through check_ids() successfully, and both are unbounded.
> > r7_old is not precise. Different id-s don't matter and different ranges don't matter.
> >
> > As another potential fix...
> > can we mark_chain_precision() right at the time of R1 = R2 when we do
> >   src_reg->id = ++env->id_gen
> > and copy_register_state();
> > for both regs?
>
> This won't help, e.g.
> for the original example the precise markings would be:
>
>   4: if (r6 > r7) goto +1   ; r6, r7
>   5: r7 = r6                ; r6, r7
>   6: if (r7 > X) goto ...   ; r6  <-- mark for r7 is still missing
>   7: r9 += r6               ; r6

Because 6 is a new state and we do mark_all_scalars_imprecise() after 5?

> What might help is to call mark_chain_precision() from
> find_equal_scalars(), but I expect this to be very expensive.

Maybe worth giving it a shot?

> > I think
> >   if (rold->precise && !check_ids(rold->id, rcur->id, idmap))
> > would be a good property to have.
> > I don't like u32_hashset either.
> > It's more or less saying that scalar id-s are incompatible with precision.
> >
> > I hope we don't need to do:
> >   + u32 reg_ids[MAX_CALL_FRAMES];
> > for backtracking either.
> > Hacking id-s into jmp history is equally bad.
> >
> > Let's figure out a minimal fix.
>
> Solution discussed with Andrii yesterday seems to work.

The thread is long. Could you please describe it again in pseudo code?

> There is still a performance regression, but much less severe:
>
> $ ./veristat -e file,prog,states -f "states_pct>5" -C master-baseline.log current.log
> File                      Program                         States (A)  States (B)  States (DIFF)
> ------------------------  ------------------------------  ----------  ----------  -----------------
> bpf_host.o                cil_to_host                            188         198       +10 (+5.32%)
> bpf_host.o                tail_handle_ipv4_from_host             225         243       +18 (+8.00%)
> bpf_host.o                tail_ipv6_host_policy_ingress           98         104        +6 (+6.12%)
> bpf_xdp.o                 tail_handle_nat_fwd_ipv6               648         806     +158 (+24.38%)
> bpf_xdp.o                 tail_lb_ipv4                          2491        2930     +439 (+17.62%)
> bpf_xdp.o                 tail_nodeport_nat_egress_ipv4          749         868     +119 (+15.89%)
> bpf_xdp.o                 tail_nodeport_nat_ingress_ipv4         375         477     +102 (+27.20%)
> bpf_xdp.o                 tail_rev_nodeport_lb4                  398         486      +88 (+22.11%)
> loop6.bpf.o               trace_virtqueue_add_sgs                226         251      +25 (+11.06%)
> pyperf600.bpf.o           on_event                             22200       45095  +22895 (+103.13%)
> pyperf600_nounroll.bpf.o  on_event                             34169       37235     +3066 (+8.97%)
>
> I need to add a bunch of tests and take a look at pyperf600.bpf.o
> before submitting the next patch-set version.

Great. Looking forward.