Re: [RFC bpf-next v0 0/7] Unifying signed and unsigned min/max tracking

Shung-Hsi Yu <shung-hsi.yu@xxxxxxxx> · Sat, 11 Nov 2023 11:45:22 +0800

On Thu, Nov 09, 2023 at 11:38:09AM -0800, Alexei Starovoitov wrote:
> On Tue, Nov 7, 2023 at 9:46 PM Shung-Hsi Yu <shung-hsi.yu@xxxxxxxx> wrote:
> > This patchset is a proof of concept for previously discussed concept[1]
> > of unifying smin/smax with umin/uax for both the 32-bit and 64-bit
> > range[2] within bpf_reg_state. This will allow the verifier to track
> > both signed and unsiged ranges using just two u32 (for 32-bit range) or
> > u64 (for 64-bit range), instead four as we currently do.
> >
> > The size reduction we gain from this probably isn't very significant.
> > The main benefit is in lowering of code _complexity_. The verifier
> > currently employs 5 value tracking mechanisms, two for 64-bit range, two
> > for 32-bit range, and one tnum that tracks individual bits. Exchanging
> > knowledge between all the bound tracking mechanisms requires a
> > (theoretical) 20-way synchronization[3]; with signed and unsigned
> > unification the verifier will only need 3 value tracking mechanism,
> > cutting this down to a 6-way synchronization.
> >
> > The unification is possible from a theoretical standpoint[4] and there
> > exists implementation[5]. The challenge lies in implementing it inside
> > the verifier and making sure it fits well with all the logic we have in
> > place.
> >
> > To lower the difficulty, the unified min/max tracking is implemented in
> > isolation, and have it correctness checked using model checking. The
> > model checking code can be found in this patchset as well, but is not
> > meant to be merged since a better method already exists[6].
> >
> > So far I've managed to implement add/sub/mul operations for unified
> > min/max tracking, the next steps are:
> > - implement operation that can be used gain knowledge from conditional
> >   jump, e.g wrange32_intersect, wrange32_diff
> > - implement wrange32_from_min_max and wrange32_to_min_max so we can
> >   check whether this PoC works using current selftests
> > - implement operations for wrange64, the 64-bit counterpart of wrange32
> > - come up with how to exchange knowledge between wrange64 and wrange32
> >   (this is likely the most difficult part)
> > - think about how to integrate this work in a manageable manner
> 
> Thanks for taking a stab at it.
> The biggest question is how to integrate it without breaking anything.
> I suspect you might need to implement all alu and branch logic
> just to be able to run the tests.

Once the wrange32_{to,from}_min_max() helpers in patch 7 is implemented, I
should be able to swap out individual alu operation while keeping
bpf_reg_state untouched. E.g. for addition in 32-bit

  static void scalar32_min_max_add(struct bpf_reg_state *dst_reg,
                                   struct bpf_reg_state *src_reg)
  {
      struct wrange32 a = wrange32_from_min_max(dst_reg->smin, dst_reg->smax,
                                                dst_reg->umin, dst_reg->umax);
      struct wrange32 b = wrange32_from_min_max(src_reg->smin, src_reg->smax,
                                                src_reg->umin, src_reg->umax);

      wrange32_to_min_max(wrange32_add(a, b), &dst_reg->smin, &dst_reg->smax,
                          &dst_reg->umin, &dst_reg->umax);
  }

and get current tests to run on top of the new algorithm. This won't cover
every aspect, but should be enough as a first taste on how well (or unwell)
the integration will be.

These helpers also can help to create finer intermediate steps for smoother
integration; something that's added in the beginning to aid the transition,
but removed after the transition is done.

> It's difficult to see a path for partial/incremental addition.
> The concern is that at the end this approach might hit an issue
> which will make it infeasible.

Agree. While the helpers above can aid with integration, I do not see a safe
path for partial addition. At least not before everything until
reg_bound_sync() proofs to work should it be considered.
Still a long way to go.

> So it's a big bet. Might be nice correctness and memory saving or nothing.
> Certainly exciting, but proceed with caution.

Having enough optimism and attachment to tackle this but not too much to the
point of overlooking its flaws is certainly a challenging task.
Will try my best :)

Thanks for the feedbacks!