On Thu, Oct 26, 2023 at 10:44 PM Shung-Hsi Yu <shung-hsi.yu@xxxxxxxx> wrote: > > On Mon, Oct 23, 2023 at 09:14:08PM +0800, Shung-Hsi Yu wrote: > > Hi, > > > > CC those who had worked on bound tracking before for feedbacks, as well as > > Dave who works on PREVAIL (verifier used on Windows) and Paul who've written > > about PREVAIL[1], for whether there's existing knowledge on this topic. > > > > Here goes a long one... > > > > --- > > > > While looking at Andrii's patches that improves bounds logic (specifically > > patches [2][3]). I realize we may be able to unify umin/umax/smin/smax into > > just two u64. Not sure if this has already been discussed off-list or is > > being worked upon, but I can't find anything regarding this by searching > > within the BPF mailing list. > > > > For simplicity sake I'll focus on unsigned bounds for now. What we have > > right in the Linux Kernel now is essentially > > > > struct bounds { > > u64 umin; > > u64 umax; > > } > > > > We can visualize the above as a number line, using asterisk to denote the > > values between umin and umax. > > > > u64 > > |----------********************--| > > > > Say if we have umin = A, and umax = B (where B > 2^63). Representing the > > magnitude of umin and umax visually would look like this > > > > <----------> A > > |----------********************--| > > <-----------------------------> B (larger than 2^63) > > > > Now if we go through a BPF_ADD operation and adds 2^(64 - 1) = 2^63, > > currently the verifier will detect that this addition overflows, and thus > > reset umin and umax to 0 and U64_MAX, respectively; blowing away existing > > knowledge. > > > > |********************************| > > > > Had we used u65 (1-bit more than u64) and tracks the bound with u65_min and > > u65_max, the verifier would have captured the bound just fine. (This idea > > comes from the special case mentioned in Andrii's patch[3]) > > > > u65 > > <---------------> 2^63 > > <----------> A > > <--------------------------> u65_min = A + 2^63 > > |--------------------------********************------------------| > > <---------------------------------------------> u65_max = B + 2^63 > > > > Continue on this thought further, let's attempting to map this back to u64 > > number lines (using two of them to fit everything in u65 range), it would > > look like > > > > u65 > > |--------------------------********************------------------| > > vvvvvvvvvvvvvvvvvvvv > > |--------------------------******|*************------------------| > > u64 u64 > > > > And would seems that we'd need two sets of u64 bounds to preserve our > > knowledge. > > > > |--------------------------******| u64 bound #1 > > |**************------------------| u64 bound #2 > > > > Or just _one_ set of u64 bound if we somehow are able to track the union of > > bound #1 and bound #2 at the same time > > > > |--------------------------******| u64 bound #1 > > U |**************------------------| u64 bound #2 > > vvvvvvvvvvvvvv vvvvvv union on the above bounds > > |**************------------******| > > > > However, this bound crosses the point between U64_MAX and 0, which is not > > semantically possible to represent with the umin/umax approach. It just > > makes no sense. > > > > |**************------------******| union of bound #1 and bound #2 > > > > The way around this is that we can slightly change how we track the bounds, > > and instead use > > > > struct bounds { > > u64 base; /* base = umin */ > > /* Maybe there's a better name other than "size" */ > > u64 size; /* size = umax - umin */ > > } > > > > Using this base + size approach, previous old bound would have looked like > > > > <----------> base = A > > |----------********************--| > > <------------------> size = B - A > > > > Looking at the bounds this way means we can now capture the union of bound > > #1 and bound #2 above. Here it is again for reference > > > > |**************------------******| union of bound #1 and bound #2 > > > > Because registers are u64-sized, they wraps, and if we extend the u64 number > > line, it would look like this due to wrapping > > > > u64 same u64 wrapped > > |**************------------******|*************------------******| > > > > Which can be capture with the base + size semantic > > > > <--------------------------> base = (u64) A + 2^63 > > |**************------------******|*************------------******| > > <------------------> size = B - A, > > doesn't change after add > > > > Or looking it with just a single u64 number line again > > > > <--------------------------> base = (u64) A + 2^63 > > |**************------------******| > > <-------------> base + size = (u64) (B + 2^32) > > > > This would mean that umin and umax is no longer readily available, we now > > have to detect whether base + size wraps to determin whether umin = 0 or > > base (and similar for umax). But the verifier already have the code to do > > that in the existing scalar_min_max_add(), so it can be done by reusing > > existing code. > > > > --- > > > > Side tracking slightly, a benefit of this base + size approach is that > > scalar_min_max_add() can be made even simpler: > > > > scalar_min_max_add(struct bpf_reg_state *dst_reg, > > struct bpf_reg_state *src_reg) > > { > > /* This looks too simplistic to have worked */ > > dst_reg.base = dst_reg.base + src_reg.base; > > dst_reg.size = dst_reg.size + src_reg.size; > > } > > > > Say we now have another unsigned bound where umin = C and umax = D > > > > <--------------------> C > > |--------------------*********---| > > <----------------------------> D > > > > If we want to track the bounds after adding two registers on with umin = A & > > umax = B, the other with umin = C and umin = D > > > > <----------> A > > |----------********************--| > > <-----------------------------> B > > + > > <--------------------> C > > |--------------------*********---| > > <----------------------------> D > > > > The results falls into the following u65 range > > > > |--------------------*********---|-------------------------------| > > + |----------********************--|-------------------------------| > > > > |------------------------------**|**************************-----| > > > > This result can be tracked with base + size approach just fine. Where the > > base and size are as follow > > > > <------------------------------> base = A + C > > |------------------------------**|**************************-----| > > <---------------------------> > > size = (B - A) + (D - C) > > > > --- > > > > Now back to the topic of unification of signed and unsigned range. Using the > > union of bound #1 and bound #2 again as an example (size = B - A, and > > base = (u64) A + 2^63) > > > > |**************------------******| union of bound #1 and bound #2 > > > > And look at it's wrapped number line form again > > > > u64 same u64 wrapped > > <--------------------------> base > > |**************------------******|*************------------******| > > <------------------> size > > > > Now add in the s64 range and align both u64 range and s64 at 0, we can see > > what previously was a bound that umin/umax cannot track is simply a valid > > smin/smax bound (idea drawn from patch [2]). > > > > 0 > > |**************------------******|*************------------******| > > |----------********************--| > > s64 > > > > The question now is be what is the "signed" base so we proceed to calculate > > the smin/smax. Note that base starts at 0, so for s64 the line that > > represents base doesn't start from the left-most location. > > (OTOH size stays the same, so we know it already) > > > > s64 > > 0 > > <-----> signed base = ? > > |----------********************--| > > <------------------> size is the same > > > > If we put u64 range back into the picture again, we can see that the "signed > > base" was, in fact, just base casted into s64, so there's really no need for > > a "signed" base at all > > > > <--------------------------> base > > |**************------------******| > > 0 > > <-----> signed base = (s64) base > > |----------********************--| > > > > Which shows base + size approach capture signed and unsigned bounds at the > > same time. Or at least its the best attempt I can make to show it. > > > > One way to look at this is that base + size is just a generalization of > > umin/umax, taking advantage of the fact that the similar underlying hardware > > is used both for the execution of BPF program and bound tracking. > > > > I wonder whether this is already being done elsewhere, e.g. by PREVAIL or > > some of static code analyzer, and I can just borrow the code from there > > (where license permits). > > As per [1], PREVAIL uses the zone domain[2][3] to track values along with > relationships between values, where as the Linux Kernel tracks values with > min/max (i.e. interval domain) and tnum. In short, PREVAIL does not use this > trick, but I guess it probably don't need to since it's already using > something that considered to be more advanced. tnum is actually critical for checking memory alignment (i.e., checking that low 2-3 bits are always zero), which range tracking can't do. So I suspect PREVAIL doesn't validate those conditions, while kernel's verifier does. > > Also, found some research papers on this topic referring to it as Wrapped > Intervals[4] or Modular Interval Domain[5]. The former has an MIT-licensed > C++ implementation[6] available as reference. > > 1: https://pchaigno.github.io/ebpf/2023/09/06/prevail-understanding-the-windows-ebpf-verifier.html > 2: https://en.wikipedia.org/wiki/Difference_bound_matrix#Zone > 3: https://github.com/vbpf/ebpf-verifier/blob/6d5ad53/src/crab/split_dbm.hpp > 4: https://dl.acm.org/doi/abs/10.1145/2651360 this one (judging by the name of the paper) looks exactly like what we are trying to do here. I'll give it a read some time later. Meanwhile I posted full patch set with range logic ([0]), feel free to take a look as well. [0] https://patchwork.kernel.org/project/netdevbpf/list/?series=797178&state=* > 5: https://hal.science/hal-00748094/document > 6: https://github.com/caballa/wrapped-intervals/blob/master/lib/RangeAnalysis/WrappedRange.cpp > > > The glaring questions left to address are: > > 1. Lots of talk with no code at all: > > Will try to work on this early November and send some result as RFC. In > > the meantime if someone is willing to give it a try I'll do my best to > > help. > > > > 2. Whether the same trick applied to scalar_min_max_add() can be applied to > > other arithmetic operations such as BPF_MUL or BPF_DIV: > > Maybe not, but we should be able to keep on using most of the existing > > bound inferring logic we have scalar_min_max_{mul,div}() since base + > > size can be viewed as a generalization of umin/umax/smin/smax. > > > > 3. (Assuming this base + size approach works) how to integrate it into our > > existing codebase: > > I think we may need to refactor out code that touches > > umin/umax/smin/smax and provide set-operation API where possible. (i.e. > > like tnum's APIs) > > > > 4. Whether the verifier loss to ability to track certain range that comes > > out of mixed u64 and s64 BPF operations, and this loss cause some BPF > > program that passes the verfier to now be rejected. > > > > 5. Probably more that I haven't think of, feel free to add or comments :) > > > > > > Shung-Hsi > > > > 1: https://pchaigno.github.io/ebpf/2023/09/06/prevail-understanding-the-windows-ebpf-verifier.html > > 2: https://lore.kernel.org/bpf/20231022205743.72352-2-andrii@xxxxxxxxxx/ > > 3: https://lore.kernel.org/bpf/20231022205743.72352-4-andrii@xxxxxxxxxx/