Re: [RFC bpf-next] bpf, verifier: improve signed ranges inference for BPF_AND

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 2024-07-16 at 22:52 +0800, Shung-Hsi Yu wrote:

[...]

> To allow verification of such instruction pattern, update
> scalar*_min_max_and() to infer signed ranges directly from signed ranges
> of the operands. With BPF_AND, the resulting value always gains more
> unset '0' bit, thus it only move towards 0x0000000000000000. The
> difficulty lies with how to deal with signs. While non-negative
> (positive and zero) value simply grows smaller, a negative number can
> grows smaller, but may also underflow and become a larger value.
> 
> To better address this situation we split the signed ranges into
> negative range and non-negative range cases, ignoring the mixed sign
> cases for now; and only consider how to calculate smax_value.
> 
> Since negative range & negative range preserve the sign bit, so we know
> the result is still a negative value, thus it only move towards S64_MIN,
> but never underflow, thus a save bet is to use a value in ranges that is
> closet to 0, thus "max(dst_reg->smax_value, src->smax_value)". For
> negative range & positive range the sign bit is always cleared, thus we
> know the resulting is a non-negative, and only moves towards 0, so a
> safe bet is to use smax_value of the non-negative range. Last but not
> least, non-negative range & non-negative range is still a non-negative
> value, and only moves towards 0; however same as the unsigned range
> case, the maximum is actually capped by the lesser of the two, and thus
> min(dst_reg->smax_value, src_reg->smax_value);
> 
> Listing out the above reasoning as a table (dst_reg abbreviated as dst,
> src_reg abbreviated as src, smax_value abbrivated as smax) we get:
> 
>                         |                         src_reg
>        smax = ?         +---------------------------+---------------------------
>                         |        negative           |       non-negative
> ---------+--------------+---------------------------+---------------------------
>          | negative     | max(dst->smax, src->smax) |         src->smax
> dst_reg  +--------------+---------------------------+---------------------------
>          | non-negative |         dst->smax         | min(dst->smax, src->smax)
> 
> However this is quite complicated, luckily it can be simplified given
> the following observations
> 
>     max(dst_reg->smax_value, src_reg->smax_value) >= src_reg->smax_value
>     max(dst_reg->smax_value, src_reg->smax_value) >= dst_reg->smax_value
>     max(dst_reg->smax_value, src_reg->smax_value) >= min(dst_reg->smax_value, src_reg->smax_value)
> 
> So we could substitute the cells in the table above all with max(...),
> and arrive at:
> 
>                         |                         src_reg
>       smax' = ?         +---------------------------+---------------------------
>                         |        negative           |       non-negative
> ---------+--------------+---------------------------+---------------------------
>          | negative     | max(dst->smax, src->smax) | max(dst->smax, src->smax)
> dst_reg  +--------------+---------------------------+---------------------------
>          | non-negative | max(dst->smax, src->smax) | max(dst->smax, src->smax)
> 
> Meaning that simply using
> 
>   max(dst_reg->smax_value, src_reg->smax_value)
> 
> to calculate the resulting smax_value would work across all sign combinations.
> 
> 
> For smin_value, we know that both non-negative range & non-negative
> range and negative range & non-negative range both result in a
> non-negative value, so an easy guess is to use the minimum non-negative
> value, thus 0.
> 
>                         |                         src_reg
>        smin = ?         +----------------------------+---------------------------
>                         |          negative          |       non-negative
> ---------+--------------+----------------------------+---------------------------
>          | negative     |             ?              |             0
> dst_reg  +--------------+----------------------------+---------------------------
>          | non-negative |             0              |             0
> 
> This leave the negative range & negative range case to be considered. We
> know that negative range & negative range always yield a negative value,
> so a preliminary guess would be S64_MIN. However, that guess is too
> imprecise to help with the r0 <<= 62, r0 s>>= 63, r0 &= -13 pattern
> we're trying to deal with here.
> 
> This can be further improve with the observation that for negative range
> & negative range, the smallest possible value must be one that has
> longest _common_ most-significant set '1' bits sequence, thus we can use
> min(dst_reg->smin_value, src->smin_value) as the starting point, as the
> smaller value will be the one with the shorter most-significant set '1'
> bits sequence. But that alone is not enough, as we do not know whether
> rest of the bits would be set, so the safest guess would be one that
> clear alls bits after the most-significant set '1' bits sequence,
> something akin to bit_floor(), but for rounding to a negative power-of-2
> instead.
> 
>     negative_bit_floor(0xffff000000000003) == 0xffff000000000000
>     negative_bit_floor(0xf0ff0000ffff0000) == 0xf000000000000000
>     negative_bit_floor(0xfffffb0000000000) == 0xfffff80000000000
> 
> With negative range & negative range solve, we now have:
> 
>                         |                         src_reg
>        smin = ?         +----------------------------+---------------------------
>                         |        negative            |       non-negative
> ---------+--------------+----------------------------+---------------------------
>          |   negative   |negative_bit_floor(         |             0
>          |              |  min(dst->smin, src->smin))|
> dst_reg  +--------------+----------------------------+---------------------------
>          | non-negative |           0                |             0
> 
> This can be further simplied since min(dst->smin, src->smin) < 0 when both
> dst_reg and src_reg have a negative range. Which means using
> 
>     negative_bit_floor(min(dst_reg->smin_value, src_reg->smin_value)
> 
> to calculate the resulting smin_value would work across all sign combinations.
> 
> Together these allows us to infer the signed range of the result of BPF_AND
> operation using the signed range from its operands.


Hi Shung-Hsi,

This seems quite elegant.
As an additional check, I did a simple brute-force for all possible
ranges of 6-bit integers and bounds are computed safely.

[...]





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux