On Tue, 29 Jan 2019 at 09:01, Eric Biggers <ebiggers@xxxxxxxxxx> wrote:
>
> The x86, arm, and arm64 asm implementations of crct10dif are very
> difficult to understand partly because many of the comments, labels, and
> macros are named incorrectly: the lengths mentioned are usually off by a
> factor of two from the actual code. Many other things are unnecessarily
> convoluted as well, e.g. there are many more fold constants than
> actually needed and some aren't fully reduced.
>
> This series therefore cleans up all these implementations to be much
> more maintainable. I also made some small optimizations where I saw
> opportunities, resulting in slightly better performance.
>
> This is based on top of the pending patches from Ard Biesheuvel.

As for v1:

Acked-by: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx>

>
> These all pass the new extra self-tests.
>
> Changed since v1:
> - Moved constants in arm implementation to .rodata.

One nit: the __adr macro is a bit pointless for v7/v8 ARM code, since
it will always resolve to a movw/movt pair, but it doesn't harm
either.

> - Eliminated a few instructions from the x86 implementation.
> - Tweaked a few comments.
>
> Eric Biggers (3):
>   crypto: x86/crct10dif-pcl - cleanup and optimizations
>   crypto: arm/crct10dif-ce - cleanup and optimizations
>   crypto: arm64/crct10dif-ce - cleanup and optimizations
>
>  arch/arm/crypto/crct10dif-ce-core.S     | 554 ++++++++--------
>  arch/arm/crypto/crct10dif-ce-glue.c     |   2 +-
>  arch/arm64/crypto/crct10dif-ce-core.S   | 496 +++++++-------
>  arch/arm64/crypto/crct10dif-ce-glue.c   |   4 +-
>  arch/x86/crypto/crct10dif-pcl-asm_64.S  | 844 +++++++++---------------
>  arch/x86/crypto/crct10dif-pclmul_glue.c |   3 +-
>  6 files changed, 797 insertions(+), 1106 deletions(-)
>
> --
> 2.20.1
>
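
For readers following the thread without the assembly at hand, here is a
minimal bit-at-a-time sketch of what crct10dif computes, plus a helper that
evaluates x^n mod G(x). It is not taken from the patches. It assumes the
standard CRC-T10DIF parameters (generator 0x8BB7, initial value 0, MSB-first,
no final XOR), and it assumes that the fold constants mentioned above are, as
is typical for carryless-multiply CRC code, values of the form x^n mod G(x),
where "fully reduced" means the value fits in 16 bits. The names crc_t10dif
and x_pow_mod are illustrative only.

```python
POLY = 0x8BB7  # CRC-T10DIF generator G(x), without the implicit x^16 term


def crc_t10dif(data: bytes, crc: int = 0) -> int:
    """Bit-at-a-time reference CRC: MSB first, initial value 0, no final XOR."""
    for byte in data:
        crc ^= byte << 8  # feed the next byte into the top of the register
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ POLY) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def x_pow_mod(n: int) -> int:
    """Return x^n mod G(x) as a 16-bit value (always fully reduced)."""
    r = 1
    for _ in range(n):
        r <<= 1
        if r & 0x10000:  # degree reached 16: subtract (XOR) G(x)
            r ^= 0x10000 | POLY
    return r


if __name__ == "__main__":
    # A message consisting of a single 1 bit followed by 128 zero bits has
    # CRC x^(128+16) mod G(x), which is the shape of a constant used when
    # folding data across a 128-bit distance.
    msg = b"\x01" + b"\x00" * 16
    print(hex(crc_t10dif(msg)), hex(x_pow_mod(128 + 16)))
```

A slow reference like this is mainly useful as an oracle for comparing
against an optimized implementation on random inputs and lengths.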