The x86, arm, and arm64 asm implementations of crct10dif are very difficult to understand partly because many of the comments, labels, and macros are named incorrectly: the lengths mentioned are usually off by a factor of two from the actual code. Many other things are unnecessarily convoluted as well, e.g. there are many more fold constants than actually needed and some aren't fully reduced. This series therefore cleans up all these implementations to be much more maintainable. I also made some small optimizations where I saw opportunities, resulting in slightly better performance. This is based on top of the pending patches from Ard Biesheuvel. These all pass the new extra self-tests. Changed since v3: - Added '.arch armv7-a' to arm32 assembly file to fix a build error. - Removed support for len < 16 from the x86 assembly. Changed since v2: - Removed the unnecessary '__LINUX_ARM_ARCH__ < 7' case. - Added Ard's Acked-by. Changed since v1: - Moved constants in arm implementation to .rodata. - Eliminated a few instructions from the x86 implementation. - Tweaked a few comments. Eric Biggers (3): crypto: x86/crct10dif-pcl - cleanup and optimizations crypto: arm/crct10dif-ce - cleanup and optimizations crypto: arm64/crct10dif-ce - cleanup and optimizations arch/arm/crypto/crct10dif-ce-core.S | 553 ++++++++--------- arch/arm/crypto/crct10dif-ce-glue.c | 2 +- arch/arm64/crypto/crct10dif-ce-core.S | 496 +++++++-------- arch/arm64/crypto/crct10dif-ce-glue.c | 4 +- arch/x86/crypto/crct10dif-pcl-asm_64.S | 782 +++++++----------------- arch/x86/crypto/crct10dif-pclmul_glue.c | 12 +- 6 files changed, 729 insertions(+), 1120 deletions(-) -- 2.20.1