[PATCH v4 0/3] crypto: crct10dif assembly cleanup and optimizations

The x86, arm, and arm64 asm implementations of crct10dif are very
difficult to understand, partly because many of the comments, labels,
and macro names are incorrect: the lengths they mention are usually off
by a factor of two from what the code actually does.  Many other things
are unnecessarily convoluted as well, e.g. there are many more fold
constants than actually needed, and some aren't fully reduced.
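
For readers unfamiliar with fold constants, here is a rough sketch of
the idea, not code from the patches.  It assumes the standard
CRC-T10DIF parameters (generator polynomial 0x8BB7, zero initial value,
no bit reflection); the helper names crc_t10dif, xpow_mod, clmul,
poly_mod, and fold are hypothetical.  A fold constant is x^n mod G(x)
for some bit distance n, and "fully reduced" means it has degree below
16.  The assembly applies the same identity to 128-bit chunks using
carry-less multiply instructions, which this sketch does not model.

```python
# G(x) = x^16 + x^15 + x^11 + x^9 + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1
POLY = 0x18BB7


def crc_t10dif(data: bytes, crc: int = 0) -> int:
    """Bit-at-a-time reference CRC-T10DIF (MSB first, init 0, no final xor)."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc <<= 1
            if crc & 0x10000:
                crc ^= POLY
    return crc


def xpow_mod(n: int) -> int:
    """Return x^n mod G(x); the result always fits in 16 bits."""
    r = 1
    for _ in range(n):
        r <<= 1
        if r & 0x10000:
            r ^= POLY
    return r


def clmul(a: int, b: int) -> int:
    """Carry-less (polynomial over GF(2)) multiplication."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def poly_mod(v: int) -> int:
    """Reduce a polynomial of any degree modulo G(x)."""
    for bit in range(v.bit_length() - 1, 15, -1):
        if (v >> bit) & 1:
            v ^= POLY << (bit - 16)
    return v


def fold(crc: int, nbits: int) -> int:
    """Advance a CRC across nbits of zero data using one fold constant."""
    return poly_mod(clmul(crc, xpow_mod(nbits)))


if __name__ == "__main__":
    head, tail = b"hello ", b"world!!"
    # crc(head || tail) == fold(crc(head), bits in tail) xor crc(tail)
    assert crc_t10dif(head + tail) == (
        fold(crc_t10dif(head), 8 * len(tail)) ^ crc_t10dif(tail)
    )
    print(hex(crc_t10dif(b"123456789")))
```

Because each fold distance needs only one constant of this form, an
implementation needs no more constants than it has distinct distances.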

This series therefore cleans up all these implementations to be much
more maintainable.  I also made some small optimizations where I saw
opportunities, resulting in slightly better performance.

This is based on top of the pending patches from Ard Biesheuvel.

All three implementations pass the new extra self-tests.

Changed since v3:
- Added '.arch armv7-a' to arm32 assembly file to fix a build error.
- Removed support for len < 16 from the x86 assembly.

Changed since v2:
- Removed the unnecessary '__LINUX_ARM_ARCH__ < 7' case.
- Added Ard's Acked-by.

Changed since v1:
- Moved constants in arm implementation to .rodata.
- Eliminated a few instructions from the x86 implementation.
- Tweaked a few comments.

Eric Biggers (3):
  crypto: x86/crct10dif-pcl - cleanup and optimizations
  crypto: arm/crct10dif-ce - cleanup and optimizations
  crypto: arm64/crct10dif-ce - cleanup and optimizations

 arch/arm/crypto/crct10dif-ce-core.S     | 553 ++++++++---------
 arch/arm/crypto/crct10dif-ce-glue.c     |   2 +-
 arch/arm64/crypto/crct10dif-ce-core.S   | 496 +++++++--------
 arch/arm64/crypto/crct10dif-ce-glue.c   |   4 +-
 arch/x86/crypto/crct10dif-pcl-asm_64.S  | 782 +++++++-----------------
 arch/x86/crypto/crct10dif-pclmul_glue.c |  12 +-
 6 files changed, 729 insertions(+), 1120 deletions(-)

-- 
2.20.1



