On Thu, May 27, 2021 at 11:14 PM <hv90@xxxxxxxxxxxxxxxxxxxxxxx> wrote: > > From: Harishankar Vishwanathan <harishankar.vishwanathan@xxxxxxxxxxx> > > This patch introduces a new algorithm for multiplication of tristate > numbers (tnums) that is provably sound. It is faster and more precise when > compared to the existing method. > > Like the existing method, this new algorithm follows the long > multiplication algorithm. The idea is to generate partial products by > multiplying each bit in the multiplier (tnum a) with the multiplicand > (tnum b), and adding the partial products after appropriately bit-shifting > them. The new algorithm, however, uses just a single loop over the bits of > the multiplier (tnum a) and accumulates only the uncertain components of > the multiplicand (tnum b) into a mask-only tnum. The following paper > explains the algorithm in more detail: https://arxiv.org/abs/2105.05398. This is a nice paper, I appreciated tables with algorithms pseudo-code and specific examples with uncertain bits, thanks! I think your algorithm makes sense, but I've also CC'ed original author of tnum logic. Edward, please take a look as well. See below mostly styling nits. > > A natural way to construct the tnum product is by performing a tnum > addition on all the partial products. This algorithm presents another > method of doing this: decompose each partial product into two tnums, > consisting of the values and the masks separately. The mask-sum is > accumulated within the loop in acc_m. The value-sum tnum is generated > using a.value * b.value. The tnum constructed by tnum addition of the > value-sum and the mask-sum contains all possible summations of concrete > values drawn from the partial product tnums pairwise. We prove this result > in the paper. > > Our evaluations show that the new algorithm is overall more precise > (producing tnums with less uncertain components) than the existing method. > As an illustrative example, consider the input tnums A and B. The numbers > in the paranthesis correspond to (value;mask). > > A = 000000x1 (1;2) > B = 0010011x (38;1) > A * B (existing) = xxxxxxxx (0;255) > A * B (new) = 0x1xxxxx (32;95) > > Importantly, we present a proof of soundness of the new algorithm in the > aforementioned paper. Additionally, we show that this new algorithm is > empirically faster than the existing method. > > Co-developed-by: Matan Shachnai <m.shachnai@xxxxxxxxxxx> > Signed-off-by: Matan Shachnai <m.shachnai@xxxxxxxxxxx> > Co-developed-by: Srinivas Narayana <srinivas.narayana@xxxxxxxxxxx> > Signed-off-by: Srinivas Narayana <srinivas.narayana@xxxxxxxxxxx> > Co-developed-by: Santosh Nagarakatte <santosh.nagarakatte@xxxxxxxxxxx> > Signed-off-by: Santosh Nagarakatte <santosh.nagarakatte@xxxxxxxxxxx> > Signed-off-by: Harishankar Vishwanathan <harishankar.vishwanathan@xxxxxxxxxxx> > --- > kernel/bpf/tnum.c | 38 +++++++++++++++++++------------------- > 1 file changed, 19 insertions(+), 19 deletions(-) > > diff --git a/kernel/bpf/tnum.c b/kernel/bpf/tnum.c > index ceac5281bd31..bb1fa1cc181d 100644 > --- a/kernel/bpf/tnum.c > +++ b/kernel/bpf/tnum.c > @@ -111,30 +111,30 @@ struct tnum tnum_xor(struct tnum a, struct tnum b) > return TNUM(v & ~mu, mu); > } > > -/* half-multiply add: acc += (unknown * mask * value). > - * An intermediate step in the multiply algorithm. > - */ > -static struct tnum hma(struct tnum acc, u64 value, u64 mask) > +struct tnum tnum_mul(struct tnum a, struct tnum b) It's probably a good idea to have a short description (from your commit description above) of the algorithm in the comment above this function, with a link to your paper. > { > - while (mask) { > - if (mask & 1) > - acc = tnum_add(acc, TNUM(0, value)); > - mask >>= 1; > - value <<= 1; > - } > - return acc; > -} > + u64 acc_v = a.value * b.value; > + struct tnum acc_m = TNUM(0, 0); > > -struct tnum tnum_mul(struct tnum a, struct tnum b) > -{ > - struct tnum acc; > - u64 pi; > + while (a.value > 0 || a.mask > 0) { `while (a.value || a.mask)` is shorter and doesn't imply that a.value or a.mask can be < 0 (otherwise you'd write != 0, right? ;) > + unnecessary empty line > + // LSB of tnum a is a certain 1 please use C-style comments /* */ > + if (((a.value & 1) == 1) && ((a.mask & 1) == 0)) just if (a.value & 1) is enough. if a.value == 1, a.mask has to be 0, right? and (x & 1) == 1 is just a more verbose way of saying (x & 1) is non-zero, which in C is just if (x & 1). > + acc_m = tnum_add(acc_m, TNUM(0, b.mask)); > > - pi = a.value * b.value; > - acc = hma(TNUM(pi, 0), a.mask, b.mask | b.value); > - return hma(acc, b.mask, a.value); > + // LSB of tnum a is uncertain > + else if ((a.mask & 1) == 1) same about comment style and simpler if statement; also another comment below > + acc_m = tnum_add(acc_m, TNUM(0, b.value | b.mask)); > + > + // Note: no case for LSB is certain 0 > + a = tnum_rshift(a, 1); > + b = tnum_lshift(b, 1); > + } > + > + return tnum_add(TNUM(acc_v, 0), acc_m); > } > > + another unnecessary empty line > /* Note that if a and b disagree - i.e. one has a 'known 1' where the other has > * a 'known 0' - this will return a 'known 1' for that bit. > */ > -- > 2.25.1 >