Andrii Nakryiko wrote: > On Wed, Oct 11, 2023 at 10:33 PM John Fastabend > <john.fastabend@xxxxxxxxx> wrote: > > > > Andrii Nakryiko wrote: > > > Currently the way that verifier prints SCALAR_VALUE register state (and > > > PTR_TO_PACKET, which can have var_off and ranges info as well) is very > > > ambiguous. > > > > > > In the name of brevity we are trying to eliminate "unnecessary" output > > > of umin/umax, smin/smax, u32_min/u32_max, and s32_min/s32_max values, if > > > possible. Current rules are that if any of those have their default > > > value (which for mins is the minimal value of its respective types: 0, > > > S32_MIN, or S64_MIN, while for maxs it's U32_MAX, S32_MAX, S64_MAX, or > > > U64_MAX) *OR* if there is another min/max value that as matching value. > > > E.g., if smin=100 and umin=100, we'll emit only umin=10, omitting smin > > > altogether. This approach has a few problems, being both ambiguous and > > > sort-of incorrect in some cases. > > > > > > Ambiguity is due to missing value could be either default value or value > > > of umin/umax or smin/smax. This is especially confusing when we mix > > > signed and unsigned ranges. Quite often, umin=0 and smin=0, and so we'll > > > have only `umin=0` leaving anyone reading verifier log to guess whether > > > smin is actually 0 or it's actually -9223372036854775808 (S64_MIN). And > > > often times it's important to know, especially when debugging tricky > > > issues. > > > > +1 > > > > > > > > "Sort-of incorrectness" comes from mixing negative and positive values. > > > E.g., if umin is some large positive number, it can be equal to smin > > > which is, interpreted as signed value, is actually some negative value. > > > Currently, that smin will be omitted and only umin will be emitted with > > > a large positive value, giving an impression that smin is also positive. > > > > > > Anyway, ambiguity is the biggest issue making it impossible to have an > > > exact understanding of register state, preventing any sort of automated > > > testing of verifier state based on verifier log. This patch is > > > attempting to rectify the situation by removing ambiguity, while > > > minimizing the verboseness of register state output. > > > > > > The rules are straightforward: > > > - if some of the values are missing, then it definitely has a default > > > value. I.e., `umin=0` means that umin is zero, but smin is actually > > > S64_MIN; > > > - all the various boundaries that happen to have the same value are > > > emitted in one equality separated sequence. E.g., if umin and smin are > > > both 100, we'll emit `smin=umin=100`, making this explicit; > > > - we do not mix negative and positive values together, and even if > > > they happen to have the same bit-level value, they will be emitted > > > separately with proper sign. I.e., if both umax and smax happen to be > > > 0xffffffffffffffff, we'll emit them both separately as > > > `smax=-1,umax=18446744073709551615`; > > > - in the name of a bit more uniformity and consistency, > > > {u32,s32}_{min,max} are renamed to {s,u}{min,max}32, which seems to > > > improve readability. > > > > agree. > > > > > > > > The above means that in case of all 4 ranges being, say, [50, 100] range, > > > we'd previously see hugely ambiguous: > > > > > > R1=scalar(umin=50,umax=100) > > > > > > Now, we'll be more explicit: > > > > > > R1=scalar(smin=umin=smin32=umin32=50,smax=umax=smax32=umax32=100) > > > > > > This is slightly more verbose, but distinct from the case when we don't > > > know anything about signed boundaries and 32-bit boundaries, which under > > > new rules will match the old case: > > > > > > R1=scalar(umin=50,umax=100) > > > > Did you consider perhaps just always printing the entire set? Its overly > > verbose I guess but I find it easier to track state across multiple > > steps this way. > > I didn't consider that because it's way too distracting and verbose > (IMO) in practice. For one, those default values represent the idea > "we don't know anything", so whether we see umax=18446744073709551615 > or just don't see umax makes little difference in practice (though > perhaps one has to come to realization that those two things are > equivalent). But also think about seeing this: > > smin=-9223372036854775807,smax=9223372036854775807,umin=0,umax=18446744073709551615,smin32=-2147483648,smax32=21474836487,umin32=0,umax32=4294967295 you could do, smin=SMIN,smax=SMAX,umin=0,umax=UMAX,smin=SMIN,smax=SMAX,umin32=0,umax32=UMAX but I see your point. > > How verbose and distracting that is, and how much time would it take > you to notice that this is not just "we don't know anything about this > register", but that actually smin is not a default, it's S64_MIN+1. > This is of course extreme example (I mostly wanted to show how verbose > default output will be), but I think the point stands that omitting > defaults brings out what extra information we have much better. > > It's an option to do it for LOG_LEVEL_2, but I would still not do > that, I'd find it too noisy even for level 2. My $.02 just leave it as you have it here. > > > > > Otherwise patch LGTM.