Re: [PATCH v5 bpf-next 2/3] bpf: Relax precision marking in open coded iters and may_goto loop.

Eduard Zingerman <eddyz87@xxxxxxxxx> · Thu, 06 Jun 2024 15:19:16 -0700

On Thu, 2024-06-06 at 13:02 -0700, Alexei Starovoitov wrote:

[...]

> > > +             reg1->var_off = tnum_unknown;
> > > +             reg2->var_off = tnum_unknown;
> > > +             break;
> > 
> > Just a random thought: suppose that one of the registers in question
> > is used as an index into the array of ints, and compiler increments it
> > using += 4. Would it be interesting to preserve alignment info in the
> > var_off in such case? (in other words, preserve known trailing zeros).
> 
> Well, the verifier cannot figure out which register is
> an induction variable. The compiler can generate code where
> there would be multiple such registers too.
> But even if there was only one rX += 4,
> it's not clear how to figure out the size of the increment.
> 
> Also the above code is called at the time of comparison like "if 2 < 100".
> I figured I will try a heuristic at that time.
> See attached diff.
> It computes alignment of LHS and RHS and
> then heuristically adjusts the range.
> After spending all morning on it and various heuristics
> I'm convinced that this is a dead end.
> It cannot be made to work with i += 2 loops.

Summary of off-list discussion below.
For the following C code:

    long arr1[1024];

    SEC("socket")
    __success
    int test1(const void *ctx)
    {
        long i;

        for (i = 0; i < 1024 && can_loop; i++)
                arr1[i] = i;
        return 0;
    }

clang generates the following BPF code:

0000000000000340 <test1>:
     104:       r1 = 0x0
     105:       r2 = 0x0 ll

0000000000000358 <LBB28_1>:
     107:       may_goto +0x4 <LBB28_3>
     108:       *(u64 *)(r2 + 0x0) = r1
     109:       r2 += 0x8
     110:       r1 += 0x1
     111:       if r1 != 0x400 goto -0x5 <LBB28_1>

0000000000000380 <LBB28_3>:
     112:       w0 = 0x0
     113:       exit

Here r2 is used as the array pointer (incremented by 8 per iteration)
and r1 as the counter.
Since r2 is never compared, it is never widened.
Hence my point about preserving trailing zeros
for widened values is moot, as it would not really
help real-life programs.
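
For reference, a minimal userspace sketch of what "preserving known
trailing zeros" in var_off would have meant. It assumes the usual
(value, mask) tnum representation; tnum_add() below follows the logic
of the kernel's tnum_add() as I understand it, while
known_trailing_zeros() is a hypothetical helper written only for this
illustration and is not part of the verifier.

```c
#include <assert.h>
#include <stdint.h>

/* Tristate number: a set bit in mask means "this bit is unknown";
 * value holds the known bits. */
struct tnum {
	uint64_t value;
	uint64_t mask;
};

const struct tnum tnum_unknown = { 0, UINT64_MAX };

struct tnum tnum_const(uint64_t v)
{
	struct tnum t = { v, 0 };
	return t;
}

/* Same arithmetic as the kernel's tnum_add(): propagate unknown bits
 * through the carry chain. */
struct tnum tnum_add(struct tnum a, struct tnum b)
{
	uint64_t sm = a.mask + b.mask;
	uint64_t sv = a.value + b.value;
	uint64_t sigma = sm + sv;
	uint64_t chi = sigma ^ sv;
	uint64_t mu = chi | a.mask | b.mask;
	struct tnum t = { sv & ~mu, mu };
	return t;
}

/* Hypothetical helper: number of low bits known to be zero. */
int known_trailing_zeros(struct tnum t)
{
	uint64_t x = t.value | t.mask;
	int n = 0;

	if (!x)
		return 64;
	while (!(x & 1)) {
		x >>= 1;
		n++;
	}
	return n;
}

/* Widening to "unknown but 8-aligned" keeps alignment across += 8,
 * widening to tnum_unknown does not. */
void alignment_demo(void)
{
	struct tnum aligned8 = { 0, ~(uint64_t)7 };

	assert(known_trailing_zeros(tnum_add(aligned8, tnum_const(8))) == 3);
	assert(known_trailing_zeros(tnum_add(tnum_unknown, tnum_const(8))) == 0);
}
```

In the test1 example above this would only matter if r2 were the
register being widened, which it is not.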