[PATCH v2 00/10] Optimize SHA256 and SHA512 for Intel x86_64 with SSSE3, AVX or AVX2 instructions

Herbert, 

The following patch series provides optimized SHA256 and SHA512 routines
for Intel x86_64 CPUs, using SSSE3, AVX or AVX2 instructions.
Depending on CPU capabilities, a speedup of 40% to 70% or more can be achieved
over the generic SHA256 and SHA512 routines.

Tim

Version 2:

1. Check AVX2 feature directly in glue code
2. Add CONFIG_AS_AVX2 check for AVX2 code
3. Use ENTRY/ENDPROC macros for assembly routines
4. Fix SSSE3 feature check in glue code
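
For readers unfamiliar with this kind of glue code, the feature checks in
items 1 and 4 amount to picking the most capable routine the CPU supports
and falling back to the generic one otherwise. The following is only a
user-space sketch of that idea, not the code in these patches: every name
in it (sha_impl, cpu_caps, pick_sha_impl, probe_cpu_caps, sha_impl_name) is
hypothetical, and it probes the CPU with the GCC/Clang builtin
__builtin_cpu_supports() rather than the kernel's own feature-test helpers,
which this cover letter does not show.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names throughout; none are taken from the patches. */
enum sha_impl {
	SHA_IMPL_GENERIC,	/* portable C fallback */
	SHA_IMPL_SSSE3,
	SHA_IMPL_AVX,
	SHA_IMPL_AVX2
};

struct cpu_caps {
	bool ssse3;
	bool avx;
	bool avx2;
};

/* Pick the most capable variant; each flag is tested on its own. */
static enum sha_impl pick_sha_impl(struct cpu_caps caps)
{
	if (caps.avx2)
		return SHA_IMPL_AVX2;
	if (caps.avx)
		return SHA_IMPL_AVX;
	if (caps.ssse3)
		return SHA_IMPL_SSSE3;
	return SHA_IMPL_GENERIC;
}

/*
 * Assumption: a GCC or Clang build targeting x86. On any other compiler
 * or architecture all flags stay false and the generic path is chosen.
 */
static struct cpu_caps probe_cpu_caps(void)
{
	struct cpu_caps caps = { false, false, false };

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
	__builtin_cpu_init();
	caps.ssse3 = __builtin_cpu_supports("ssse3") != 0;
	caps.avx = __builtin_cpu_supports("avx") != 0;
	caps.avx2 = __builtin_cpu_supports("avx2") != 0;
#endif
	return caps;
}

/* Human-readable name, e.g. for a log line at initialization time. */
static const char *sha_impl_name(enum sha_impl impl)
{
	static const char *const names[] = {
		"generic", "ssse3", "avx", "avx2"
	};

	assert((unsigned int)impl < 4);
	return names[impl];
}
```

A caller would run pick_sha_impl(probe_cpu_caps()) once at startup and
keep the result. Keeping the selection separate from the probing means the
priority order can be tested without depending on the machine it runs on.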

Thanks to Peter Anvin, Jussi Kivilinna and Jim Kukunas for their reviews and comments.

Tim Chen (10):
  Expose SHA256 generic routine to be callable externally.
  Optimized sha256 x86_64 assembly routine using Supplemental SSE3
    instructions.
  Optimized sha256 x86_64 assembly routine with AVX instructions.
  Optimized sha256 x86_64 routine using AVX2's RORX instructions
  Create module providing optimized SHA256 routines using SSSE3, AVX or
    AVX2 instructions.
  Expose generic sha512 routine to be callable from other modules
  Optimized SHA512 x86_64 assembly routine using Supplemental SSE3
    instructions.
  Optimized SHA512 x86_64 assembly routine using AVX instructions.
  Optimized SHA512 x86_64 assembly routine using AVX2 RORX instruction.
  Create module providing optimized SHA512 routines using SSSE3, AVX or
    AVX2 instructions.

 arch/x86/crypto/Makefile            |   4 +
 arch/x86/crypto/sha256-avx-asm.S    | 496 +++++++++++++++++++++++
 arch/x86/crypto/sha256-avx2-asm.S   | 772 ++++++++++++++++++++++++++++++++++++
 arch/x86/crypto/sha256-ssse3-asm.S  | 506 +++++++++++++++++++++++
 arch/x86/crypto/sha256_ssse3_glue.c | 275 +++++++++++++
 arch/x86/crypto/sha512-avx-asm.S    | 423 ++++++++++++++++++++
 arch/x86/crypto/sha512-avx2-asm.S   | 743 ++++++++++++++++++++++++++++++++++
 arch/x86/crypto/sha512-ssse3-asm.S  | 421 ++++++++++++++++++++
 arch/x86/crypto/sha512_ssse3_glue.c | 282 +++++++++++++
 crypto/Kconfig                      |  22 +
 crypto/sha256_generic.c             |  11 +-
 crypto/sha512_generic.c             |  13 +-
 include/crypto/sha.h                |   5 +
 13 files changed, 3962 insertions(+), 11 deletions(-)
 create mode 100644 arch/x86/crypto/sha256-avx-asm.S
 create mode 100644 arch/x86/crypto/sha256-avx2-asm.S
 create mode 100644 arch/x86/crypto/sha256-ssse3-asm.S
 create mode 100644 arch/x86/crypto/sha256_ssse3_glue.c
 create mode 100644 arch/x86/crypto/sha512-avx-asm.S
 create mode 100644 arch/x86/crypto/sha512-avx2-asm.S
 create mode 100644 arch/x86/crypto/sha512-ssse3-asm.S
 create mode 100644 arch/x86/crypto/sha512_ssse3_glue.c

-- 
1.7.11.7

