Re: [PATCH bpf-next 4/5] bpf: disambiguate SCALAR register state output in verifier logs

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Thu, 12 Oct 2023 09:22:23 -0700

On Wed, Oct 11, 2023 at 10:33 PM John Fastabend
<john.fastabend@xxxxxxxxx> wrote:
>
> Andrii Nakryiko wrote:
> > Currently the way that verifier prints SCALAR_VALUE register state (and
> > PTR_TO_PACKET, which can have var_off and ranges info as well) is very
> > ambiguous.
> >
> > In the name of brevity we are trying to eliminate "unnecessary" output
> > of umin/umax, smin/smax, u32_min/u32_max, and s32_min/s32_max values, if
> > possible. Current rules are that if any of those have their default
> > value (which for mins is the minimal value of its respective types: 0,
> > S32_MIN, or S64_MIN, while for maxs it's U32_MAX, S32_MAX, S64_MAX, or
> > U64_MAX) *OR* if there is another min/max value that as matching value.
> > E.g., if smin=100 and umin=100, we'll emit only umin=10, omitting smin
> > altogether. This approach has a few problems, being both ambiguous and
> > sort-of incorrect in some cases.
> >
> > Ambiguity is due to missing value could be either default value or value
> > of umin/umax or smin/smax. This is especially confusing when we mix
> > signed and unsigned ranges. Quite often, umin=0 and smin=0, and so we'll
> > have only `umin=0` leaving anyone reading verifier log to guess whether
> > smin is actually 0 or it's actually -9223372036854775808 (S64_MIN). And
> > often times it's important to know, especially when debugging tricky
> > issues.
>
> +1
>
> >
> > "Sort-of incorrectness" comes from mixing negative and positive values.
> > E.g., if umin is some large positive number, it can be equal to smin
> > which is, interpreted as signed value, is actually some negative value.
> > Currently, that smin will be omitted and only umin will be emitted with
> > a large positive value, giving an impression that smin is also positive.
> >
> > Anyway, ambiguity is the biggest issue making it impossible to have an
> > exact understanding of register state, preventing any sort of automated
> > testing of verifier state based on verifier log. This patch is
> > attempting to rectify the situation by removing ambiguity, while
> > minimizing the verboseness of register state output.
> >
> > The rules are straightforward:
> >   - if some of the values are missing, then it definitely has a default
> >   value. I.e., `umin=0` means that umin is zero, but smin is actually
> >   S64_MIN;
> >   - all the various boundaries that happen to have the same value are
> >   emitted in one equality separated sequence. E.g., if umin and smin are
> >   both 100, we'll emit `smin=umin=100`, making this explicit;
> >   - we do not mix negative and positive values together, and even if
> >   they happen to have the same bit-level value, they will be emitted
> >   separately with proper sign. I.e., if both umax and smax happen to be
> >   0xffffffffffffffff, we'll emit them both separately as
> >   `smax=-1,umax=18446744073709551615`;
> >   - in the name of a bit more uniformity and consistency,
> >   {u32,s32}_{min,max} are renamed to {s,u}{min,max}32, which seems to
> >   improve readability.
>
> agree.
>
> >
> > The above means that in case of all 4 ranges being, say, [50, 100] range,
> > we'd previously see hugely ambiguous:
> >
> >     R1=scalar(umin=50,umax=100)
> >
> > Now, we'll be more explicit:
> >
> >     R1=scalar(smin=umin=smin32=umin32=50,smax=umax=smax32=umax32=100)
> >
> > This is slightly more verbose, but distinct from the case when we don't
> > know anything about signed boundaries and 32-bit boundaries, which under
> > new rules will match the old case:
> >
> >     R1=scalar(umin=50,umax=100)
>
> Did you consider perhaps just always printing the entire set? Its overly
> verbose I guess but I find it easier to track state across multiple
> steps this way.

I didn't consider that because it's way too distracting and verbose
(IMO) in practice. For one, those default values represent the idea
"we don't know anything", so whether we see umax=18446744073709551615
or just don't see umax makes little difference in practice (though
perhaps one has to come to realization that those two things are
equivalent). But also think about seeing this:

smin=-9223372036854775807,smax=9223372036854775807,umin=0,umax=18446744073709551615,smin32=-2147483648,smax32=21474836487,umin32=0,umax32=4294967295

How verbose and distracting that is, and how much time would it take
you to notice that this is not just "we don't know anything about this
register", but that actually smin is not a default, it's S64_MIN+1.
This is of course extreme example (I mostly wanted to show how verbose
default output will be), but I think the point stands that omitting
defaults brings out what extra information we have much better.

It's an option to do it for LOG_LEVEL_2, but I would still not do
that, I'd find it too noisy even for level 2.

>
> Otherwise patch LGTM.