Thanks for reviewing this patch, Andrii. All of your comments make sense to us. We will resend the patch with the fixes you requested. On Sun, May 30, 2021 at 1:59 AM Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote: > > On Thu, May 27, 2021 at 11:14 PM <hv90@xxxxxxxxxxxxxxxxxxxxxxx> wrote: > > > > From: Harishankar Vishwanathan <harishankar.vishwanathan@xxxxxxxxxxx> > > > > This patch introduces a new algorithm for multiplication of tristate > > numbers (tnums) that is provably sound. It is faster and more precise when > > compared to the existing method. > > > > Like the existing method, this new algorithm follows the long > > multiplication algorithm. The idea is to generate partial products by > > multiplying each bit in the multiplier (tnum a) with the multiplicand > > (tnum b), and adding the partial products after appropriately bit-shifting > > them. The new algorithm, however, uses just a single loop over the bits of > > the multiplier (tnum a) and accumulates only the uncertain components of > > the multiplicand (tnum b) into a mask-only tnum. The following paper > > explains the algorithm in more detail: https://arxiv.org/abs/2105.05398. > > This is a nice paper, I appreciated tables with algorithms pseudo-code > and specific examples with uncertain bits, thanks! > > I think your algorithm makes sense, but I've also CC'ed original > author of tnum logic. Edward, please take a look as well. > > See below mostly styling nits. > > > > > A natural way to construct the tnum product is by performing a tnum > > addition on all the partial products. This algorithm presents another > > method of doing this: decompose each partial product into two tnums, > > consisting of the values and the masks separately. The mask-sum is > > accumulated within the loop in acc_m. The value-sum tnum is generated > > using a.value * b.value. The tnum constructed by tnum addition of the > > value-sum and the mask-sum contains all possible summations of concrete > > values drawn from the partial product tnums pairwise. We prove this result > > in the paper. > > > > Our evaluations show that the new algorithm is overall more precise > > (producing tnums with less uncertain components) than the existing method. > > As an illustrative example, consider the input tnums A and B. The numbers > > in the paranthesis correspond to (value;mask). > > > > A = 000000x1 (1;2) > > B = 0010011x (38;1) > > A * B (existing) = xxxxxxxx (0;255) > > A * B (new) = 0x1xxxxx (32;95) > > > > Importantly, we present a proof of soundness of the new algorithm in the > > aforementioned paper. Additionally, we show that this new algorithm is > > empirically faster than the existing method. > > > > Co-developed-by: Matan Shachnai <m.shachnai@xxxxxxxxxxx> > > Signed-off-by: Matan Shachnai <m.shachnai@xxxxxxxxxxx> > > Co-developed-by: Srinivas Narayana <srinivas.narayana@xxxxxxxxxxx> > > Signed-off-by: Srinivas Narayana <srinivas.narayana@xxxxxxxxxxx> > > Co-developed-by: Santosh Nagarakatte <santosh.nagarakatte@xxxxxxxxxxx> > > Signed-off-by: Santosh Nagarakatte <santosh.nagarakatte@xxxxxxxxxxx> > > Signed-off-by: Harishankar Vishwanathan <harishankar.vishwanathan@xxxxxxxxxxx> > > --- > > kernel/bpf/tnum.c | 38 +++++++++++++++++++------------------- > > 1 file changed, 19 insertions(+), 19 deletions(-) > > > > diff --git a/kernel/bpf/tnum.c b/kernel/bpf/tnum.c > > index ceac5281bd31..bb1fa1cc181d 100644 > > --- a/kernel/bpf/tnum.c > > +++ b/kernel/bpf/tnum.c > > @@ -111,30 +111,30 @@ struct tnum tnum_xor(struct tnum a, struct tnum b) > > return TNUM(v & ~mu, mu); > > } > > > > -/* half-multiply add: acc += (unknown * mask * value). > > - * An intermediate step in the multiply algorithm. > > - */ > > -static struct tnum hma(struct tnum acc, u64 value, u64 mask) > > +struct tnum tnum_mul(struct tnum a, struct tnum b) > > It's probably a good idea to have a short description (from your > commit description above) of the algorithm in the comment above this > function, with a link to your paper. > > > { > > - while (mask) { > > - if (mask & 1) > > - acc = tnum_add(acc, TNUM(0, value)); > > - mask >>= 1; > > - value <<= 1; > > - } > > - return acc; > > -} > > + u64 acc_v = a.value * b.value; > > + struct tnum acc_m = TNUM(0, 0); > > > > -struct tnum tnum_mul(struct tnum a, struct tnum b) > > -{ > > - struct tnum acc; > > - u64 pi; > > + while (a.value > 0 || a.mask > 0) { > > `while (a.value || a.mask)` is shorter and doesn't imply that a.value > or a.mask can be < 0 (otherwise you'd write != 0, right? ;) > > > + > > unnecessary empty line > > > + // LSB of tnum a is a certain 1 > > please use C-style comments /* */ > > > + if (((a.value & 1) == 1) && ((a.mask & 1) == 0)) > > just if (a.value & 1) is enough. if a.value == 1, a.mask has to be 0, > right? and (x & 1) == 1 is just a more verbose way of saying (x & 1) > is non-zero, which in C is just if (x & 1). > > > + acc_m = tnum_add(acc_m, TNUM(0, b.mask)); > > > > - pi = a.value * b.value; > > - acc = hma(TNUM(pi, 0), a.mask, b.mask | b.value); > > - return hma(acc, b.mask, a.value); > > + // LSB of tnum a is uncertain > > + else if ((a.mask & 1) == 1) > > same about comment style and simpler if statement; also another comment below > > > + acc_m = tnum_add(acc_m, TNUM(0, b.value | b.mask)); > > + > > + // Note: no case for LSB is certain 0 > > + a = tnum_rshift(a, 1); > > + b = tnum_lshift(b, 1); > > + } > > + > > + return tnum_add(TNUM(acc_v, 0), acc_m); > > } > > > > + > > another unnecessary empty line > > > /* Note that if a and b disagree - i.e. one has a 'known 1' where the other has > > * a 'known 0' - this will return a 'known 1' for that bit. > > */ > > -- > > 2.25.1 > >