Re: [PATCH 0/6] x86: new optimized CRC functions, with VPCLMULQDQ support

On Mon, 25 Nov 2024 at 05:12, Eric Biggers <ebiggers@xxxxxxxxxx> wrote:
>
> This patchset is also available in git via:
>
>     git fetch https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux.git crc-x86-v1
>
> This patchset applies on top of my other recent CRC patchsets
> https://lore.kernel.org/r/20241103223154.136127-1-ebiggers@xxxxxxxxxx/ and
> https://lore.kernel.org/r/20241117002244.105200-1-ebiggers@xxxxxxxxxx/ .
> Consider it a preview for what may be coming next, as my priority is
> getting those two other patchsets merged first.
>
> This patchset adds a new assembly macro that expands into the body of a
> CRC function for x86 for the specified number of bits, bit order, vector
> length, and AVX level.  There's also a new script that generates the
> constants needed by this function, given a CRC generator polynomial.
>
> This approach allows easily wiring up an x86-optimized implementation of
> any variant of CRC-8, CRC-16, CRC-32, or CRC-64, including full support
> for VPCLMULQDQ.  On long messages the resulting functions are up to 4x
> faster than the existing PCLMULQDQ optimized functions when they exist,
> or up to 29x faster than the existing table-based functions.
>
> This patchset starts by wiring up the new macro for crc32_le,
> crc_t10dif, and crc32_be.  Later I'd also like to wire up crc64_be and
> crc64_rocksoft, once the design of the library functions for those has
> been fixed to be like what I'm doing for crc32* and crc_t10dif.
>
> A similar approach of sharing code between CRC variants, and vector
> lengths when applicable, should work for other architectures.  The CRC
> constant generation script should be mostly reusable.
>
> Eric Biggers (6):
>   x86: move zmm exclusion list into CPU feature flag
>   scripts/crc: add gen-crc-consts.py
>   x86/crc: add "template" for [V]PCLMULQDQ based CRC functions
>   x86/crc32: implement crc32_le using new template
>   x86/crc-t10dif: implement crc_t10dif using new template
>   x86/crc32: implement crc32_be using new template
>

Good stuff!

Acked-by: Ard Biesheuvel <ardb@xxxxxxxxxx>

Would indeed be nice to get CRC-64 implemented this way as well, so we
can use it on both x86 and arm64.
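
For context on what "constants" means in the cover letter above: carryless
multiplication based CRC code typically needs values of the form x^n mod G(x),
where G is the generator polynomial. The following is only a minimal sketch of
that calculation over GF(2), not the actual gen-crc-consts.py; the helper name
xpow_mod and the exponents chosen here are illustrative assumptions, and the
real script may use different exponents, bit orders, and output formats.

```python
def xpow_mod(n, poly, bits):
    """Return x^n mod G over GF(2), where G = x^bits + poly.

    `poly` holds the low `bits` coefficients of G (the leading x^bits
    term is implicit), in most-significant-bit-first order.
    Hypothetical helper for illustration only.
    """
    full = (1 << bits) | poly   # G with its leading term made explicit
    r = 1                       # start from x^0
    for _ in range(n):
        r <<= 1                 # multiply by x
        if (r >> bits) & 1:     # degree reached `bits`: reduce modulo G
            r ^= full
    return r


CRC32_POLY = 0x04C11DB7   # generator used by crc32_be (MSB-first form)
T10DIF_POLY = 0x8BB7      # generator used by crc_t10dif

if __name__ == "__main__":
    # Illustrative exponents only; not the set the real script emits.
    for n in (64, 96, 128):
        print(f"crc32:  x^{n} mod G = {xpow_mod(n, CRC32_POLY, 32):#010x}")
        print(f"t10dif: x^{n} mod G = {xpow_mod(n, T10DIF_POLY, 16):#06x}")
```

The same routine covers any CRC-8, CRC-16, CRC-32, or CRC-64 generator by
changing `poly` and `bits`, which is the property that makes a single
generator script reusable across variants.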



