Re: [PATCH v2] bpf, verifier: Improve precision of BPF_MUL

M Shachnai <m.shachnai@xxxxxxxxx> · Mon, 2 Dec 2024 18:31:11 -0500

On Wed, Nov 27, 2024 at 5:53 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
>
> On Wed, 2024-11-27 at 02:41 -0500, Matan Shachnai wrote:
>
> [...]
>
> > In conclusion, with this patch,
> >
> > 1. We were able to show that we can improve the overall precision of
> >    BPF_MUL. We proved (using an SMT solver) that this new version of
> >    BPF_MUL is at least as precise as the current version for all inputs.
> >
> > 2. We are able to prove the soundness of the new scalar_min_max_mul() and
> >    scalar32_min_max_mul(). By leveraging the existing proof of tnum_mul
> >    [1], we can say that the composition of these three functions within
> >    BPF_MUL is sound.
>
> Hi Matan,
>
> I think this is a nice simplification of the existing code.
> Could you please also add a few canary tests in the
> tools/testing/selftests/bpf/progs/verifier_bounds.c ?
> (e.g. simple case plus possible edge cases).

Thanks for your feedback, Eduard! We'll be happy to add test-cases to
exercise BPF_MUL.

> Something like:
>
>     SEC("tc")
>     __success __log_level(2)
>     __msg("r6 *= r7 {{.*}}; R6_w=some-range-here")
>     __naked void mult_mixed_sign(void)
>     {
>         asm volatile (
>         "call %[bpf_get_prandom_u32];"
>         "r6 = r0;"
>         "call %[bpf_get_prandom_u32];"
>         "r7 = r0;"
>         "r6 &= 0xf;"
>         "r6 -= 1000000000;"
>         "r7 &= 0xf;"
>         "r7 -= 2000000000;"
>         "r6 *= r7;"
>         "exit"
>         :
>         : __imm(bpf_get_prandom_u32),
>           __imm(bpf_skb_store_bytes)
>         : __clobber_all);
>     }
>
> We usually do this as a separate patch in a patch-set.
>
> Also, it looks like this has limited applicability in practice,
> because small negative values denote huge unsigned values,
> hence overflow check kicks in for such values.
> E.g. no range inferred for [-10,5] * [-20,-5]:
>
>   0: (85) call bpf_get_prandom_u32#7    ; R0_w=scalar()
>   1: (bf) r6 = r0                       ; R0_w=scalar(id=1) R6_w=scalar(id=1)
>   2: (85) call bpf_get_prandom_u32#7    ; R0_w=scalar()
>   3: (bf) r7 = r0                       ; R0_w=scalar(id=2) R7_w=scalar(id=2)
>   4: (57) r6 &= 15                      ; R6_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=15,var_off=(0x0; 0xf))
>   5: (17) r6 -= 10                      ; R6_w=scalar(smin=smin32=-10,smax=smax32=5)
>   6: (57) r7 &= 15                      ; R7_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=15,var_off=(0x0; 0xf))
>   7: (17) r7 -= 20                      ; R7_w=scalar(smin=smin32=-20,smax=smax32=-5,umin=0xffffffffffffffec,umax=0xfffffffffffffffb,umin32=0xffffffec,umax32=0xfffffffb,var_off=(0xffffffffffffffe0; 0x1f))
>   8: (2f) r6 *= r7                      ; R6_w=scalar() R7_w=scalar(smin=smin32=-20,smax=smax32=-5,umin=0xffffffffffffffec,umax=0xfffffffffffffffb,umin32=0xffffffec,umax32=0xfffffffb,var_off=(0xffffffffffffffe0; 0x1f))
>   9: (95) exit
>
> Compared to:
>
>   0: R1=ctx() R10=fp0
>   ; asm volatile ( @ verifier_bounds.c:1208
>   0: (85) call bpf_get_prandom_u32#7    ; R0_w=scalar()
>   1: (bf) r6 = r0                       ; R0_w=scalar(id=1) R6_w=scalar(id=1)
>   2: (85) call bpf_get_prandom_u32#7    ; R0_w=scalar()
>   3: (bf) r7 = r0                       ; R0_w=scalar(id=2) R7_w=scalar(id=2)
>   4: (57) r6 &= 15                      ; R6_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=15,var_off=(0x0; 0xf))
>   5: (17) r6 -= 1000000000              ; R6_w=scalar(smin=0xffffffffc4653600,smax=0xffffffffc465360f,umin=0xffffffffc4653600,umax=0xffffffffc465360f,smin32=umin32=0xc4653600,smax32=umax32=0xc465360f,var_off=(0xffffffffc4653600; 0xf))
>   6: (57) r7 &= 15                      ; R7_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=15,var_off=(0x0; 0xf))
>   7: (17) r7 -= 2000000000              ; R7_w=scalar(smin=0xffffffff88ca6c00,smax=0xffffffff88ca6c0f,umin=0xffffffff88ca6c00,umax=0xffffffff88ca6c0f,smin32=umin32=0x88ca6c00,smax32=umax32=0x88ca6c0f,var_off=(0xffffffff88ca6c00; 0xf))
>   8: (2f) r6 *= r7                      ; R6_w=scalar(smax=0x7ffffffffffffeff,umax=0xfffffffffffffeff,smax32=0x7ffffeff,umax32=0xfffffeff,var_off=(0x0; 0xfffffffffffffeff)) R7_w=scalar(smin=0xffffffff88ca6c00,smax=0xffffffff88ca6c0f,umin=0xffffffff88ca6c00,umax=0xffffffff88ca6c0f,smin32=umin32=0x88ca6c00,smax32=umax32=0x88ca6c0f,var_off=(0xffffffff88ca6c00; 0xf))
>   9: (95) exit
>
> Is it possible to do check_mul_overflow() for signed bounds and
> rely on reg_bounds_sync() for unsigned?
>

The patch in its current form (and the existing BPF_MUL version in the
verifier) doesn't handle negative values well, as the example you gave
here illustrates. The initial goal of this patch was to improve
precision of unsigned multiplication. However, there is a canonical
way to perform signed multiplication which is sound and is able to
handle negative values. Specifically, signed multiplication can be
performed soundly by [min(a, b, c, d), max(a, b, c, d)], where a, b,
c, d correspond to the four products obtained by multiplying all the
bounds (these products are checked for overflows). For better
precision, we propose having both unsigned multiplication as well as
signed multiplication. The resulting bounds can then be refined in
reg_bounds_sync().

We will update our patch with both signed and unsigned multiplication,
add test-cases, and send it all as a patch-set soon.

Best,
Matan

> [...]
>