Re: [Bug Report] bpf: incorrectly pruning runtime execution path

Eduard Zingerman <eddyz87@xxxxxxxxx> · Thu, 14 Dec 2023 02:08:56 +0200

On Wed, 2023-12-13 at 15:30 -0800, Andrii Nakryiko wrote:
[...]
> Yes, thanks, the execution trace above was helpful. Let's try to
> minimize the example here, I'll keep original instruction indices,
> though:
> 
>    23: (bf) r5 = r8                   ; here we link r5 and r8 together
>    26: (7e) if w8 s>= w0 goto pc+5    ; here it's not always/never
> taken, so w8 and w0 remain imprecise
>    28: (0f) r8 += r8                  ; here link between r8 and r5 is broken
>    29: (d6) if w5 s<= 0x1d goto pc+2  ; here we know value of w5 and
> so it's always/never taken, r5 is marked precise
> 
> Now, if we look at r5's precision log at this instruction:
> 
> 29: (d6) if w5 s<= 0x1d goto pc+2
> mark_precise: frame0: last_idx 29 first_idx 26 subseq_idx -1
> mark_precise: frame0: regs=r5 stack= before 28: (0f) r8 += r8
> mark_precise: frame0: regs=r5 stack= before 27: (4f) r8 |= r8
> mark_precise: frame0: regs=r5 stack= before 26: (7e) if w8 s>= w0 goto pc+5

Sorry, maybe it's time for me to get some sleep, but I don't see an
issue here. The "before" log is printed by backtrack_insn() before
instruction is backtracked. So the following:

> mark_precise: frame0: regs=r5 stack= before 26: (7e) if w8 s>= w0 goto pc+5

Is a state of backtracker before "if w8 s>= w0 ..." is processed.
But running the test case I've shared wider precision trace for
this instruction looks as follows:

  26: (7e) if w8 s>= w0 goto pc+5       ; R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0; 0x3))
                                          R8=scalar(id=2,smax32=1)
  27: (4f) r8 |= r8                     ; R8_w=scalar()
  28: (0f) r8 += r8                     ; R8_w=scalar()
  29: (d6) if w5 s<= 0x1d goto pc+2
  mark_precise: frame0: last_idx 29 first_idx 26 subseq_idx -1 
  mark_precise: frame0: regs=r5 stack= before 28: (0f) r8 += r8
  mark_precise: frame0: regs=r5 stack= before 27: (4f) r8 |= r8
  mark_precise: frame0: regs=r5 stack= before 26: (7e) if w8 s>= w0 goto pc+5
  mark_precise: frame0: parent state regs=r5 stack=: 
     R0_rw=scalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0; 0x3))
     R2_w=4
     R3_w=0x1f00000034
     R4_w=scalar(smin=0,smax=umax=0x3fffffffc,smax32=0x7ffffffc,umax32=0xfffffffc,
                 var_off=(0x0; 0x3fffffffc))
     R5_rw=Pscalar(id=2)
     R8_rw=scalar(id=2) R10=fp0
  mark_precise: frame0: last_idx 24 first_idx 11 subseq_idx 26 
  mark_precise: frame0: regs=r5,r8 stack= before 24: (18) r2 = 0x4     <------ !!!
  mark_precise: frame0: regs=r5,r8 stack= before 23: (bf) r5 = r8
  mark_precise: frame0: regs=r8 stack= before 22: (67) r4 <<= 2
  ...

Note, that right after "if w8 s>= w0 goto pc+5" is processed the
backtracker state is:

  mark_precise: frame0: regs=r5,r8 stack= before 24: (18) r2 = 0x4

So both r5 and r8 are accounted for.

> Note how at this instruction r5 and r8 *WERE* linked together, but we
> already lost this information for backtracking. So we don't mark w8 as
> precise. That's one part of the problem.
> 
> The second part is that even if we knew that w8/r8 is precise, should
> we mark w0/r0 as precise? I actually need to think about this some
> more. Right now for conditional jumps we eagerly mark precision for
> both registers only in always/never taken scenarios.
> 
> For now just narrowing down the issue, as I'm sure not many people
> followed all the above stuff carefully.
> 
> 
> P.S. For solving tracking of linked registers we can probably utilize
> instruction history, though technically they can be spread across
> multiple frames, between registers and stack slots, so that's a bit
> tricky.

For precision back-propagation we currently rely on id values stored
in the parent state, when moving up from child to parent boundary.