On Sun, Dec 06, 2020 at 11:45:23PM +0100, Ard Biesheuvel wrote: > Follow the same approach as the arm64 driver for implementing a version > of AES-NI in CBC mode that supports ciphertext stealing. Compared to the > generic CTS template wrapped around the existing cbc-aes-aesni skcipher, > this results in a ~2x speed increase for relatively short inputs (less > than 256 bytes), which is relevant given that AES-CBC with ciphertext > stealing is used for filename encryption in the fscrypt layer. For larger > inputs, the speedup is still significant (~25% on decryption, ~6% on > encryption). > > Signed-off-by: Ard Biesheuvel <ardb@xxxxxxxxxx> > --- > Full tcrypt benchmark results for cts(cbc-aes-aesni) vs cts-cbc-aes-aesni > after the diff (Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz) > > arch/x86/crypto/aesni-intel_asm.S | 87 +++++++++++++ > arch/x86/crypto/aesni-intel_glue.c | 133 ++++++++++++++++++++ > 2 files changed, 220 insertions(+) This is passing the self-tests (including the extra tests), and it's definitely faster, and would be useful for fscrypt. I did my own benchmarks and got Encryption: Message size Before (MB/s) After (MB/s) ------------ ------------- ------------ 32 136.83 273.04 64 230.03 262.04 128 372.92 487.71 256 541.41 652.95 Decryption: Message size Before (MB/s) After (MB/s) ------------ ------------- ------------ 32 121.95 280.04 64 208.72 279.72 128 397.98 635.79 256 723.09 1105.05 (This was with "Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz") So feel free to add: Tested-by: Eric Biggers <ebiggers@xxxxxxxxxx> I might not have time to fully review this, but one comment below: > +static int cts_cbc_encrypt(struct skcipher_request *req) > +{ [...] > +static int cts_cbc_decrypt(struct skcipher_request *req) > +{ [...] > #ifdef CONFIG_X86_64 > + }, { > + .base = { > + .cra_name = "__cts(cbc(aes))", > + .cra_driver_name = "__cts-cbc-aes-aesni", > + .cra_priority = 400, > + .cra_flags = CRYPTO_ALG_INTERNAL, > + .cra_blocksize = AES_BLOCK_SIZE, > + .cra_ctxsize = CRYPTO_AES_CTX_SIZE, > + .cra_module = THIS_MODULE, > + }, > + .min_keysize = AES_MIN_KEY_SIZE, > + .max_keysize = AES_MAX_KEY_SIZE, > + .ivsize = AES_BLOCK_SIZE, > + .walksize = 2 * AES_BLOCK_SIZE, > + .setkey = aesni_skcipher_setkey, > + .encrypt = cts_cbc_encrypt, > + .decrypt = cts_cbc_decrypt, The algorithm is conditional on CONFIG_X86_64, but the function definitions aren't. It needs to be one way or the other, otherwise there will be a compiler warning on 32-bit builds. - Eric