Multibuffer hashing was a constant sore point while it was part of the
kernel. It was very buggy and unnecessarily complex. It was finally
removed after it had been broken for a while without anyone noticing.

Peace reigned in its absence, until Eric Biggers made a proposal for
its comeback :)

Link: https://lore.kernel.org/all/20240415213719.120673-1-ebiggers@xxxxxxxxxx/

The issue is that the SHA algorithm (and possibly others) is
inherently not parallelisable. Therefore the only way to exploit
parallelism on modern CPUs is to hash multiple independent streams
of data.

Eric's proposal is a simple interface bolted onto shash that takes
two streams of data of identical length. I thought the limit of
two was too small, and Eric addressed that in his latest version:

Link: https://lore.kernel.org/all/20241001153718.111665-2-ebiggers@xxxxxxxxxx/

However, I still disliked the addition of this to shash as it meant
that users would have to spend extra effort in order to accumulate
and maintain multiple streams of data.

My preference is to use ahash as the basis of multibuffer, because
its request object interface is perfectly suited to chaining.

The ahash interface is almost universally hated because of its use
of the SG list. So to sweeten the deal I have added virtual address
support to ahash, thus rendering the shash interface redundant.

Note that ahash can already be used synchronously by asking for
sync-only algorithms. Thus there is no need to handle callbacks
and such *if* you don't want to.

This patch-set introduces two additions to the ahash interface.
First of all, request chaining is added so that an arbitrary number
of requests can be submitted in one go. Incidentally this also
reduces the cost of indirect calls by amortisation.

It then adds virtual address support to ahash. This allows the
user to supply a virtual address as the input instead of an SG
list.

This is assumed to be not DMA-capable so it is always copied
before it's passed to an existing ahash driver.
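
To make the chaining idea concrete, here is a minimal userspace
sketch. It is not the kernel API and not code from this series: the
names mb_req, mb_req_init, mb_req_chain, mb_submit and toy_hash are
all hypothetical, and FNV-1a stands in for SHA-256 so that the
example stays self-contained. It only models two points made above:
each request carries a plain virtual address instead of an SG list,
and a whole chain is handed over in a single call, which the "driver"
then consumes in batches of up to 8 lanes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MB_LANES 8	/* assumed lane count, matching the 8 requests above */

/* Hypothetical request object; this is NOT struct ahash_request. */
struct mb_req {
	const void *vaddr;	/* input given as a plain virtual address */
	size_t len;
	uint32_t digest;	/* toy 32-bit result */
	struct mb_req *next;	/* link to the next request in the chain */
};

/* FNV-1a, used purely as a stand-in for a real hash such as SHA-256. */
static uint32_t toy_hash(const void *data, size_t len)
{
	const unsigned char *p = data;
	uint32_t h = 2166136261u;

	while (len--) {
		h ^= *p++;
		h *= 16777619u;
	}
	return h;
}

static void mb_req_init(struct mb_req *req, const void *vaddr, size_t len)
{
	req->vaddr = vaddr;
	req->len = len;
	req->digest = 0;
	req->next = NULL;
}

/* Append req to the tail of the chain that starts at head. */
static void mb_req_chain(struct mb_req *req, struct mb_req *head)
{
	while (head->next)
		head = head->next;
	head->next = req;
}

/*
 * Submit the whole chain with one call and return the number of
 * batches used. A real multibuffer driver would advance all lanes
 * of a batch in lock-step with SIMD; this model hashes them in turn.
 */
static unsigned int mb_submit(struct mb_req *head)
{
	unsigned int batches = 0;

	assert(head != NULL);
	while (head) {
		struct mb_req *lane[MB_LANES];
		unsigned int n = 0, i;

		for (; head && n < MB_LANES; head = head->next)
			lane[n++] = head;
		for (i = 0; i < n; i++)
			lane[i]->digest = toy_hash(lane[i]->vaddr,
						   lane[i]->len);
		batches++;
	}
	return batches;
}
```

In this model a caller initialises each request, chains the second
and later ones onto the first, and calls mb_submit() once on the
head. With ten chained requests the sketch uses two batches (8 + 2),
so the per-call overhead is paid once per chain rather than once per
request.
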
New drivers can elect to take virtual addresses directly. Of course
existing shash algorithms are able to take virtual addresses without
any copying.

The final patch resurrects the old SHA2 AVX2 multibuffer code as a
proof of concept that this API works. The result shows that with a
full complement of 8 requests, this API is able to achieve parity
with the more modern but single-threaded SHA-NI code.

Herbert Xu (6):
  crypto: ahash - Only save callback and data in ahash_save_req
  crypto: hash - Add request chaining API
  crypto: tcrypt - Restore multibuffer ahash tests
  crypto: ahash - Add virtual address support
  crypto: ahash - Set default reqsize from ahash_alg
  crypto: x86/sha2 - Restore multibuffer AVX2 support

 arch/x86/crypto/Makefile                   |   2 +-
 arch/x86/crypto/sha256_mb_mgr_datastruct.S | 304 +++++++++++
 arch/x86/crypto/sha256_ssse3_glue.c        | 523 ++++++++++++++++--
 arch/x86/crypto/sha256_x8_avx2.S           | 596 +++++++++++++++++++++
 crypto/ahash.c                             | 566 ++++++++++++++++---
 crypto/tcrypt.c                            | 227 ++++++++
 include/crypto/algapi.h                    |  10 +
 include/crypto/hash.h                      |  68 ++-
 include/crypto/internal/hash.h             |  17 +-
 include/linux/crypto.h                     |  26 +
 10 files changed, 2209 insertions(+), 130 deletions(-)
 create mode 100644 arch/x86/crypto/sha256_mb_mgr_datastruct.S
 create mode 100644 arch/x86/crypto/sha256_x8_avx2.S

-- 
2.39.5