Re: [PATCH bpf-next] bpf: tnums: Provably sound, faster, and more precise algorithm for tnum_mul

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Thanks for reviewing this patch, Andrii. All of your comments make sense to us.
We will resend the patch with the fixes you requested.


On Sun, May 30, 2021 at 1:59 AM Andrii Nakryiko
<andrii.nakryiko@xxxxxxxxx> wrote:
>
> On Thu, May 27, 2021 at 11:14 PM <hv90@xxxxxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > From: Harishankar Vishwanathan <harishankar.vishwanathan@xxxxxxxxxxx>
> >
> > This patch introduces a new algorithm for multiplication of tristate
> > numbers (tnums) that is provably sound. It is faster and more precise when
> > compared to the existing method.
> >
> > Like the existing method, this new algorithm follows the long
> > multiplication algorithm. The idea is to generate partial products by
> > multiplying each bit in the multiplier (tnum a) with the multiplicand
> > (tnum b), and adding the partial products after appropriately bit-shifting
> > them. The new algorithm, however, uses just a single loop over the bits of
> > the multiplier (tnum a) and accumulates only the uncertain components of
> > the multiplicand (tnum b) into a mask-only tnum. The following paper
> > explains the algorithm in more detail: https://arxiv.org/abs/2105.05398.
>
> This is a nice paper, I appreciated tables with algorithms pseudo-code
> and specific examples with uncertain bits, thanks!
>
> I think your algorithm makes sense, but I've also CC'ed original
> author of tnum logic. Edward, please take a look as well.
>
> See below mostly styling nits.
>
> >
> > A natural way to construct the tnum product is by performing a tnum
> > addition on all the partial products. This algorithm presents another
> > method of doing this: decompose each partial product into two tnums,
> > consisting of the values and the masks separately. The mask-sum is
> > accumulated within the loop in acc_m. The value-sum tnum is generated
> > using a.value * b.value. The tnum constructed by tnum addition of the
> > value-sum and the mask-sum contains all possible summations of concrete
> > values drawn from the partial product tnums pairwise. We prove this result
> > in the paper.
> >
> > Our evaluations show that the new algorithm is overall more precise
> > (producing tnums with less uncertain components) than the existing method.
> > As an illustrative example, consider the input tnums A and B. The numbers
> > in the paranthesis correspond to (value;mask).
> >
> > A                = 000000x1 (1;2)
> > B                = 0010011x (38;1)
> > A * B (existing) = xxxxxxxx (0;255)
> > A * B (new)      = 0x1xxxxx (32;95)
> >
> > Importantly, we present a proof of soundness of the new algorithm in the
> > aforementioned paper. Additionally, we show that this new algorithm is
> > empirically faster than the existing method.
> >
> > Co-developed-by: Matan Shachnai <m.shachnai@xxxxxxxxxxx>
> > Signed-off-by: Matan Shachnai <m.shachnai@xxxxxxxxxxx>
> > Co-developed-by: Srinivas Narayana <srinivas.narayana@xxxxxxxxxxx>
> > Signed-off-by: Srinivas Narayana <srinivas.narayana@xxxxxxxxxxx>
> > Co-developed-by: Santosh Nagarakatte <santosh.nagarakatte@xxxxxxxxxxx>
> > Signed-off-by: Santosh Nagarakatte <santosh.nagarakatte@xxxxxxxxxxx>
> > Signed-off-by: Harishankar Vishwanathan <harishankar.vishwanathan@xxxxxxxxxxx>
> > ---
> >  kernel/bpf/tnum.c | 38 +++++++++++++++++++-------------------
> >  1 file changed, 19 insertions(+), 19 deletions(-)
> >
> > diff --git a/kernel/bpf/tnum.c b/kernel/bpf/tnum.c
> > index ceac5281bd31..bb1fa1cc181d 100644
> > --- a/kernel/bpf/tnum.c
> > +++ b/kernel/bpf/tnum.c
> > @@ -111,30 +111,30 @@ struct tnum tnum_xor(struct tnum a, struct tnum b)
> >         return TNUM(v & ~mu, mu);
> >  }
> >
> > -/* half-multiply add: acc += (unknown * mask * value).
> > - * An intermediate step in the multiply algorithm.
> > - */
> > -static struct tnum hma(struct tnum acc, u64 value, u64 mask)
> > +struct tnum tnum_mul(struct tnum a, struct tnum b)
>
> It's probably a good idea to have a short description (from your
> commit description above) of the algorithm in the comment above this
> function, with a link to your paper.
>
> >  {
> > -       while (mask) {
> > -               if (mask & 1)
> > -                       acc = tnum_add(acc, TNUM(0, value));
> > -               mask >>= 1;
> > -               value <<= 1;
> > -       }
> > -       return acc;
> > -}
> > +       u64 acc_v = a.value * b.value;
> > +       struct tnum acc_m = TNUM(0, 0);
> >
> > -struct tnum tnum_mul(struct tnum a, struct tnum b)
> > -{
> > -       struct tnum acc;
> > -       u64 pi;
> > +       while (a.value > 0 || a.mask > 0) {
>
> `while (a.value || a.mask)` is shorter and doesn't imply that a.value
> or a.mask can be < 0 (otherwise you'd write != 0, right? ;)
>
> > +
>
> unnecessary empty line
>
> > +               // LSB of tnum a is a certain 1
>
> please use C-style comments /* */
>
> > +               if (((a.value & 1) == 1) && ((a.mask & 1) == 0))
>
> just if (a.value & 1) is enough. if a.value == 1, a.mask has to be 0,
> right? and (x & 1) == 1 is just a more verbose way of saying (x & 1)
> is non-zero, which in C is just if (x & 1).
>
> > +                       acc_m = tnum_add(acc_m, TNUM(0, b.mask));
> >
> > -       pi = a.value * b.value;
> > -       acc = hma(TNUM(pi, 0), a.mask, b.mask | b.value);
> > -       return hma(acc, b.mask, a.value);
> > +               // LSB of tnum a is uncertain
> > +               else if ((a.mask & 1) == 1)
>
> same about comment style and simpler if statement; also another comment below
>
> > +                       acc_m = tnum_add(acc_m, TNUM(0, b.value | b.mask));
> > +
> > +               // Note: no case for LSB is certain 0
> > +               a = tnum_rshift(a, 1);
> > +               b = tnum_lshift(b, 1);
> > +       }
> > +
> > +       return tnum_add(TNUM(acc_v, 0), acc_m);
> >  }
> >
> > +
>
> another unnecessary empty line
>
> >  /* Note that if a and b disagree - i.e. one has a 'known 1' where the other has
> >   * a 'known 0' - this will return a 'known 1' for that bit.
> >   */
> > --
> > 2.25.1
> >



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux