Hi Herbert,

On Sat, Dec 14, 2019 at 9:56 AM Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx> wrote:
>
> Eric Biggers <ebiggers@xxxxxxxxxx> wrote:
> >
> > Now, it's possible that the performance gain outweighs this, and I too would
> > like to have the C implementation of Poly1305 be faster. So if you'd like to
> > argue for the performance gain, fine, and if there's a significant performance
> > gain I don't have an objection. But I'm not sure why you're at the same time
> > trying to argue that *adding* an extra implementation somehow makes the code
> > easier to audit and doesn't add complexity...
>
> Right. We need the numbers not because we're somehow attached
> to the existing code, but we need them to show that we should
> carry the burden of having two C implementations, 32-bit vs 64-bit.

This info is now in the commit message of the version in my tree,
rather than sprinkled around casually in these threads. I also did a
bit more benchmarking this morning.

From <https://git.zx2c4.com/linux-dev/commit/?h=jd/crypto-5.5&id=900b79e1ff48f1f294ef3e9fb2520699c8895860>:

> Testing with kbench9000, depending on the CPU, the update function for
> the 32x32 version has been improved by 4%-7%, and for the 64x64 by
> 19%-30%. The 32x32 gains are small, but I think there's great value in
> having a parallel implementation to the 64x64 one so that the two can be
> compared side-by-side as nice stand-alone units.

I'll resubmit this on Monday.

Regards,
Jason