Eric Biggers <ebiggers@xxxxxxxxxx> wrote:
> This patchset adds new AES-XTS implementations that accelerate disk and
> file encryption on modern x86_64 CPUs.
>
> The largest improvements are seen on CPUs that support the VAES
> extension: Intel Ice Lake (2019) and later, and AMD Zen 3 (2020) and
> later. However, an implementation using plain AESNI + AVX is also added
> and provides a small boost on older CPUs too.
>
> To try to handle the mess that is x86 SIMD, the code for all the new
> AES-XTS implementations is generated from an assembly macro. This makes
> it so that we e.g. don't have to have entirely different source code
> just for different vector lengths (xmm, ymm, zmm).
>
> To avoid downclocking effects, zmm registers aren't used on certain
> Intel CPU models such as Ice Lake. These CPU models default to an
> implementation using ymm registers instead.
>
> This patchset increases the throughput of AES-256-XTS decryption by the
> following amounts on the following CPUs:
>
>                         | 4096-byte messages | 512-byte messages |
>   ----------------------+--------------------+-------------------+
>   Intel Skylake         |         1%         |        11%        |
>   Intel Ice Lake        |        92%         |        59%        |
>   Intel Sapphire Rapids |       115%         |        78%        |
>   AMD Zen 1             |        25%         |        20%        |
>   AMD Zen 2             |        26%         |        20%        |
>   AMD Zen 3             |        82%         |        40%        |
>   AMD Zen 4             |       118%         |        48%        |
>
> (The results for encryption are very similar to decryption. I just tend
> to measure decryption because decryption performance is more important.)
>
> There's no separate kconfig option for the new AES-XTS implementations,
> as they are included in the existing option CONFIG_CRYPTO_AES_NI_INTEL.
>
> To make testing easier, all four new AES-XTS implementations are
> registered separately with the crypto API. They are prioritized
> appropriately so that the best one for the CPU is used by default.
>
> Open questions:
>
> - Is the policy that I implemented for preferring ymm registers to zmm
>   registers the right one?
>   arch/x86/crypto/poly1305_glue.c thinks that
>   only Skylake has the bad downclocking. My current proposal is a bit
>   more conservative; it also excludes Ice Lake and Tiger Lake. Those
>   CPUs supposedly still have some downclocking, though not as much.
>
> - Should the policy on the use of zmm registers be in a centralized
>   place? It probably doesn't make sense to have random different
>   policies for different crypto algorithms (AES, Poly1305, ARIA, etc.).
>
> - Are there any other known issues with using AVX512 in kernel mode? It
>   seems to work, and technically it's not new because Poly1305 and ARIA
>   already use AVX512, including the mask registers and zmm registers up
>   to 31. So if there was a major issue, like the new registers not
>   being properly saved and restored, it probably would have already
>   been found. But AES-XTS support would introduce a wider use of it.
>
> Eric Biggers (6):
>   x86: add kconfig symbols for assembler VAES and VPCLMULQDQ support
>   crypto: x86/aes-xts - add AES-XTS assembly macro for modern CPUs
>   crypto: x86/aes-xts - wire up AESNI + AVX implementation
>   crypto: x86/aes-xts - wire up VAES + AVX2 implementation
>   crypto: x86/aes-xts - wire up VAES + AVX10/256 implementation
>   crypto: x86/aes-xts - wire up VAES + AVX10/512 implementation
>
>  arch/x86/Kconfig.assembler           |  10 +
>  arch/x86/crypto/Makefile             |   3 +-
>  arch/x86/crypto/aes-xts-avx-x86_64.S | 796 +++++++++++++++++++++++++++
>  arch/x86/crypto/aesni-intel_glue.c   | 263 ++++++++-
>  4 files changed, 1070 insertions(+), 2 deletions(-)
>  create mode 100644 arch/x86/crypto/aes-xts-avx-x86_64.S
>
>
> base-commit: 4cece764965020c22cff7665b18a012006359095

All applied.  Thanks.
-- 
Email: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt