Hi Eric,

2017-03-30 21:55 GMT+02:00 Eric Biggers <ebiggers3@xxxxxxxxx>:
> This is an improvement; I'm just thinking that maybe this should be done for all
> the gf128mul_x_*() functions, if only so that they use a consistent style and
> are all defined next to each other.

Right, that doesn't seem to be a bad idea... I was confused for a while
by the '& 0xff' in the _lle one, but now I see it also uses just two
values of the table, so it can be re-written in a similar way. In fact,
the OCB mode from RFC 7253 (that I'm currently trying to port to kernel
crypto API) uses gf128mul_x_bbe, so it would be useful to have that one
accessible, too.

I will move them all in v2, then.

> Also note that '(b & ((u64)1 << 63)) ? 0x87 : 0x00;' is actually getting
> compiled as '((s64)b >> 63) & 0x87', which is branchless and therefore makes the
> new version more efficient than one might expect:
>
>         sar    $0x3f,%rax
>         and    $0x87,%eax
>
> It could even be written the branchless way explicitly, but it shouldn't matter.

I think the definition using unsigned operations is more intuitive...
Let's just leave the clever tricks up to the compiler :)

Thanks,
O.M.

>
> - Eric
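
For reference, the two forms of the mask discussed above can be put side by side in a small standalone sketch. This is only an illustration, not the kernel code: it uses plain uint64_t instead of the kernel's u64/le128 types, and the names gf128_mask_ternary(), gf128_mask_branchless() and gf128_mul_x_sketch(), as well as the hi/lo split of the 128-bit value, are hypothetical and chosen just for this example.

```c
#include <stdint.h>

/*
 * Hypothetical helpers for illustration only; these are not the
 * kernel's gf128mul_x_*() functions.
 */

/* Reduction constant chosen with a conditional on the top bit, as in
 * the expression quoted in the thread. */
static inline uint64_t gf128_mask_ternary(uint64_t b)
{
	return (b & ((uint64_t)1 << 63)) ? 0x87 : 0x00;
}

/*
 * The same value written branchlessly. This assumes that converting to
 * int64_t wraps and that right-shifting a negative value is an
 * arithmetic shift; both are implementation-defined in ISO C before
 * C23, although GCC and Clang behave this way.
 */
static inline uint64_t gf128_mask_branchless(uint64_t b)
{
	return (uint64_t)((int64_t)b >> 63) & 0x87;
}

/*
 * Multiply the 128-bit value hi:lo by x, reducing by
 * x^128 + x^7 + x^2 + x + 1 (hence the 0x87 constant).
 * The hi/lo layout is an assumption of this sketch.
 */
static inline void gf128_mul_x_sketch(uint64_t *hi, uint64_t *lo)
{
	uint64_t carry = gf128_mask_ternary(*hi);

	*hi = (*hi << 1) | (*lo >> 63);
	*lo = (*lo << 1) ^ carry;
}
```

Both mask helpers should return the same value for every input, so the choice between them is about readability rather than behaviour when the compiler already emits the sar/and sequence for the ternary form.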