This patch series provides an accelerated/optimized Chacha20 and Poly1305
implementation for Power10 or later CPUs (ppc64le). The module implements
the algorithms specified in RFC 7539. The implementation delivers 3.5X
better performance than the baseline for Chacha20 and Poly1305
individually, and a 1.5X improvement for the combined Chacha20/Poly1305
operation.

The series has been tested with the kernel crypto module tcrypt.ko and
passes the selftests. It has also been tested with
CONFIG_CRYPTO_MANAGER_EXTRA_TESTS enabled.

Danny Tsen (5):
  An optimized Chacha20 implementation with 8-way unrolling for ppc64le.
  Glue code for optimized Chacha20 implementation for ppc64le.
  An optimized Poly1305 implementation with 4-way unrolling for ppc64le.
  Glue code for optimized Poly1305 implementation for ppc64le.
  Update Kconfig and Makefile.

 arch/powerpc/crypto/Kconfig             |   26 +
 arch/powerpc/crypto/Makefile            |    4 +
 arch/powerpc/crypto/chacha-p10-glue.c   |  223 +++++
 arch/powerpc/crypto/chacha-p10le-8x.S   |  842 ++++++++++++++++++
 arch/powerpc/crypto/poly1305-p10-glue.c |  186 ++++
 arch/powerpc/crypto/poly1305-p10le_64.S | 1075 +++++++++++++++++++++++
 6 files changed, 2356 insertions(+)
 create mode 100644 arch/powerpc/crypto/chacha-p10-glue.c
 create mode 100644 arch/powerpc/crypto/chacha-p10le-8x.S
 create mode 100644 arch/powerpc/crypto/poly1305-p10-glue.c
 create mode 100644 arch/powerpc/crypto/poly1305-p10le_64.S

--
2.31.1