[PATCH v3 0/6] x86 CRC optimizations

This patchset applies to the crc tree and is also available at:

    git fetch https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux.git crc-x86-v3

This series replaces the existing x86 PCLMULQDQ optimized CRC code with
new code that is shared among the different CRC variants and also adds
VPCLMULQDQ support, greatly improving performance on recent CPUs.  The
last patch wires up the same optimization to crc64_be() and crc64_nvme()
(a.k.a. the old "crc64_rocksoft"), which were previously unoptimized,
improving the performance of those CRC functions by as much as 100x.
crc64_be is used by bcachefs, and crc64_nvme is used by blk-integrity.
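
For readers less familiar with the variants named above, the following
is a minimal bit-at-a-time reference sketch in Python.  It is not code
from this series, and it says nothing about how the assembly works; it
only illustrates the difference between lsb-first CRCs (such as
crc32_le and crc64_nvme) and msb-first CRCs (such as crc_t10dif and
crc64_be).  The function names are hypothetical, and the polynomial
constants are assumed to match the kernel's definitions (ECMA-182 for
crc64_be, the NVMe/Rocksoft polynomial in reflected form for
crc64_nvme).

```python
import zlib

# Polynomial constants, assumed to match the kernel's definitions.
CRC32_POLY_REFLECTED = 0xEDB88320             # crc32_le (lsb-first)
CRC_T10DIF_POLY = 0x8BB7                      # crc_t10dif (msb-first, 16-bit)
CRC64_BE_POLY = 0x42F0E1EBA9EA3693            # crc64_be (msb-first, ECMA-182)
CRC64_NVME_POLY_REFLECTED = 0x9A6C9329AC4BC9B5  # crc64_nvme (lsb-first)


def crc_lsb_first(crc, data, poly_reflected):
    """Bit-at-a-time CRC, processing the least significant bit first."""
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (poly_reflected if crc & 1 else 0)
    return crc


def crc_msb_first(crc, data, poly, width):
    """Bit-at-a-time CRC, processing the most significant bit first."""
    top_bit = 1 << (width - 1)
    mask = (1 << width) - 1
    for byte in data:
        crc ^= byte << (width - 8)
        for _ in range(8):
            crc = ((crc << 1) ^ poly if crc & top_bit else crc << 1) & mask
    return crc


# Sanity check against zlib, which inverts the CRC before and after;
# here the caller does the inversion explicitly.
data = b"123456789"
crc32 = crc_lsb_first(0xFFFFFFFF, data, CRC32_POLY_REFLECTED) ^ 0xFFFFFFFF
assert crc32 == zlib.crc32(data)
```

A loop like this handles one bit per iteration, whereas carryless
multiplication lets the optimized code process whole vectors at a time.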

Changed in v3:
- It's back to just the x86 patches now, since I've applied the CRC64
  library rework patches.
- Added review and ack tags.
- Made more improvements to crc-pclmul-template.S and gen-crc-consts.py,
  such as improving the comments that explain some of the steps,
  tweaking the exact choice of constants in certain cases where more
  than one is equivalent, sharing a bit more of the source code between
  lsb-first and msb-first CRCs, and eliminating an unnecessary instruction.
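
As background on the constants mentioned above, here is a hedged Python
sketch of the general idea behind carryless-multiplication folding.  It
is not taken from gen-crc-consts.py, and the exponents and bit order
used in crc-pclmul-consts.h may well differ.  It assumes only the
standard identity that multiplying by x^n modulo the generator G can be
replaced by multiplying by the precomputed constant (x^n mod G).  All
function names are hypothetical, and polynomials are held as Python
ints with the highest-degree term in the most significant bit.

```python
def clmul(a, b):
    """Carryless (GF(2)) multiplication of two polynomials held as ints."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def polymod(a, g):
    """Remainder of polynomial a divided by polynomial g over GF(2)."""
    g_len = g.bit_length()
    while a.bit_length() >= g_len:
        a ^= g << (a.bit_length() - g_len)
    return a


def x_pow_mod(n, g):
    """The constant x^n mod g."""
    return polymod(1 << n, g)


def fold_128(acc, nxt, g, distance=128):
    """Fold a 128-bit accumulator forward by `distance` bits into `nxt`.

    The result is congruent to (acc * x^distance + nxt) mod g, but is
    computed with two 64-bit carryless multiplications and two XORs.
    """
    hi, lo = acc >> 64, acc & ((1 << 64) - 1)
    k_hi = x_pow_mod(distance + 64, g)
    k_lo = x_pow_mod(distance, g)
    return clmul(hi, k_hi) ^ clmul(lo, k_lo) ^ nxt


# Example with the CRC-32 generator polynomial, including its x^32 term.
G = 0x104C11DB7
acc = 0x0123456789ABCDEFFEDCBA9876543210
nxt = 0x0F1E2D3C4B5A69788796A5B4C3D2E1F0
assert polymod(fold_128(acc, nxt, G), G) == polymod((acc << 128) ^ nxt, G)
```

Because the folded value stays within 128 bits, the same step can be
repeated across the whole buffer, leaving a single reduction modulo G
for the end.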

Changed in v2:
- Rebased onto upstream
- Added CRC64 library rework patches
- Capitalized YMM and ZMM
- Moved gen-crc-consts.py from scripts/crc/ to just scripts/
- Renamed crc-pclmul-template-glue.h to just crc-pclmul-template.h
- The asm functions that use longer vectors no longer tail-call the ones
  that use shorter vectors in order to handle short lengths.  Each
  function now handles all lengths >= 16 bytes directly.
- Made various other improvements to crc-pclmul-template.S and
  gen-crc-consts.py
- It's 2025 now; updated the copyright statements
- Improved commit messages
- Added ack tags

Eric Biggers (6):
  x86: move ZMM exclusion list into CPU feature flag
  scripts/gen-crc-consts: add gen-crc-consts.py
  x86/crc: add "template" for [V]PCLMULQDQ based CRC functions
  x86/crc32: implement crc32_le using new template
  x86/crc-t10dif: implement crc_t10dif using new template
  x86/crc64: implement crc64_be and crc64_nvme using new template

 MAINTAINERS                         |   1 +
 arch/x86/Kconfig                    |   3 +-
 arch/x86/crypto/aesni-intel_glue.c  |  22 +-
 arch/x86/include/asm/cpufeatures.h  |   1 +
 arch/x86/kernel/cpu/intel.c         |  22 ++
 arch/x86/lib/Makefile               |   5 +-
 arch/x86/lib/crc-pclmul-consts.h    | 195 ++++++++++
 arch/x86/lib/crc-pclmul-template.S  | 584 ++++++++++++++++++++++++++++
 arch/x86/lib/crc-pclmul-template.h  |  81 ++++
 arch/x86/lib/crc-t10dif-glue.c      |  23 +-
 arch/x86/lib/crc16-msb-pclmul.S     |   6 +
 arch/x86/lib/crc32-glue.c           |  37 +-
 arch/x86/lib/crc32-pclmul.S         | 219 +----------
 arch/x86/lib/crc64-glue.c           |  50 +++
 arch/x86/lib/crc64-pclmul.S         |   7 +
 arch/x86/lib/crct10dif-pcl-asm_64.S | 332 ----------------
 scripts/gen-crc-consts.py           | 239 ++++++++++++
 17 files changed, 1214 insertions(+), 613 deletions(-)
 create mode 100644 arch/x86/lib/crc-pclmul-consts.h
 create mode 100644 arch/x86/lib/crc-pclmul-template.S
 create mode 100644 arch/x86/lib/crc-pclmul-template.h
 create mode 100644 arch/x86/lib/crc16-msb-pclmul.S
 create mode 100644 arch/x86/lib/crc64-glue.c
 create mode 100644 arch/x86/lib/crc64-pclmul.S
 delete mode 100644 arch/x86/lib/crct10dif-pcl-asm_64.S
 create mode 100755 scripts/gen-crc-consts.py


base-commit: 5b793bbee96c666ca14db8409509abd73a3e0130
-- 
2.48.1




