Hi Martin,

On Thu, Dec 12, 2019 at 4:34 PM Martin Willi <martin@xxxxxxxxxxxxxx> wrote:
> As the author of the removed code, I'm certainly biased, so I won't
> hinder the adaption of the new code.

Thanks.

> * It removes the existing SSE2 code path. Most likely not that much of
>   an issue due to the new AVX variant.

It's not clear that the SSE2 code is even faster than the x86_64 scalar
code in the new implementation, actually. Either way, I don't think it
really matters, based on the chips we care about targeting.

> * I certainly would favor gradual improvement, and I think the code
>   would allow it. But as said, not my pick.

You saw this code well over a year ago and seemed okay with it at the
time. Meanwhile you were inspired to fix your ChaCha implementation to
narrow the gap, but there has been no progress on your Poly1305 one. And
I'd like to avoid adding a NEW implementation to audit for bugs and
vulnerabilities. By contrast, this code here is in widespread use and
has been highly scrutinized. So please, don't waste time doing such a
thing. I'd nack it on the grounds of it being an unnecessary risk.

> * Those 4000+ lines perl/asm are a lot

Ard just added the same for the new Poly1305 implementations on ARM,
ARM64, MIPS, and MIPS64. This is the code that has seen the most
eyeballs of any code in this category. And now we're finally converging
on a complete set for that, with x86_64 being the last holdout.

Please don't hinder its adoption. Your old code is slow and hasn't
received much scrutiny. This new code is fast and has received a lot of
scrutiny.

Jason