On Mon, 28 Jan 2019 at 09:57, Eric Biggers <ebiggers@xxxxxxxxxx> wrote: > > The x86, arm, and arm64 asm implementations of crct10dif are very > difficult to understand partly because many of the comments, labels, and > macros are named incorrectly: the lengths mentioned are usually off by a > factor of two from the actual code. Many other things are unnecessarily > convoluted as well, e.g. there are many more fold constants than > actually needed and some aren't fully reduced. > > This series therefore cleans up all these implementations to be much > more maintainable. I also made some small optimizations where I saw > opportunities, resulting in slightly better performance. > Wonderful, thanks for doing this Eric. When I ported the code from x86, I noticed that the comments were out of sync with the code, but I wasn't aware of the other issues at all. Assuming the new code passes your extended tests, Acked-by: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx> for the series, but with one minor nit for the ARM version: could we please move the literal pool to .rodata as we did for the arm64 version? > This is based on top of the pending patches from Ard Biesheuvel. > > Eric Biggers (3): > crypto: x86/crct10dif-pcl - cleanup and optimizations > crypto: arm/crct10dif-ce - cleanup and optimizations > crypto: arm64/crct10dif-ce - cleanup and optimizations > > arch/arm/crypto/crct10dif-ce-core.S | 546 +++++++-------- > arch/arm/crypto/crct10dif-ce-glue.c | 2 +- > arch/arm64/crypto/crct10dif-ce-core.S | 496 +++++++------- > arch/arm64/crypto/crct10dif-ce-glue.c | 4 +- > arch/x86/crypto/crct10dif-pcl-asm_64.S | 848 +++++++++--------------- > arch/x86/crypto/crct10dif-pclmul_glue.c | 3 +- > 6 files changed, 794 insertions(+), 1105 deletions(-) > > -- > 2.20.1 >