> -----Original Message-----
> From: Taehee Yoo <ap420073@xxxxxxxxx>
> Sent: Friday, August 26, 2022 12:32 AM
> Subject: [PATCH v2 2/3] crypto: aria-avx: add AES-NI/AVX/x86_64 assembler
> implementation of aria cipher
>
> v2:
>  - Do not call non-FPU functions(aria_{encrypt | decrypt}() in the
>    FPU context.
>  - Do not acquire FPU context for too long.
...
> +static int ecb_do_encrypt(struct skcipher_request *req, const u32 *rkey)
> +{
...
> +	while ((nbytes = walk.nbytes) > 0) {
> +		const u8 *src = walk.src.virt.addr;
> +		u8 *dst = walk.dst.virt.addr;
> +
> +		kernel_fpu_begin();
> +		while (nbytes >= ARIA_AVX_BLOCK_SIZE) {
> +			aria_aesni_avx_crypt_16way(rkey, dst, src, ctx->rounds);
> +			dst += ARIA_AVX_BLOCK_SIZE;
> +			src += ARIA_AVX_BLOCK_SIZE;
> +			nbytes -= ARIA_AVX_BLOCK_SIZE;
> +		}
> +		kernel_fpu_end();

Per Herbert's reply on the sha512-avx RCU stall issue, another nesting
level might be necessary to limit the amount of data processed between
each kernel_fpu_begin()/kernel_fpu_end() pair to 4 KiB.

If you modify this driver to use the ECB_WALK_START, ECB_BLOCK, and
ECB_WALK_END macros from ecb_cbc_helpers.h and incorporate that fix,
then your fix would be easy to replicate in the other users (camellia,
cast5, cast6, serpent, and twofish).
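
To illustrate the extra nesting level, here is a rough userspace sketch,
not the driver code itself. kernel_fpu_begin() and kernel_fpu_end() are
replaced by counting stubs, crypt_16way_stub() is a hypothetical stand-in
for aria_aesni_avx_crypt_16way() that only copies data, FPU_BYTES_MAX is
a made-up name for the 4 KiB budget, and the value 256 for
ARIA_AVX_BLOCK_SIZE (16 blocks of 16 bytes) is an assumption. The
skcipher walk is left out; the function takes one contiguous buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed values, for illustration only. */
#define ARIA_AVX_BLOCK_SIZE 256u   /* 16 blocks x 16 bytes (assumption) */
#define FPU_BYTES_MAX       4096u  /* hypothetical name: 4 KiB per FPU section */

/* Userspace stand-ins for the kernel primitives, with bookkeeping. */
static int fpu_active;
static unsigned int fpu_sections;
static size_t fpu_section_bytes, fpu_section_bytes_max;

static void kernel_fpu_begin(void)
{
	assert(!fpu_active);            /* sections must not nest */
	fpu_active = 1;
	fpu_sections++;
	fpu_section_bytes = 0;
}

static void kernel_fpu_end(void)
{
	assert(fpu_active);
	fpu_active = 0;
}

/* Hypothetical stand-in for the 16-way routine: copies one wide block. */
static void crypt_16way_stub(uint8_t *dst, const uint8_t *src)
{
	assert(fpu_active);             /* only legal inside an FPU section */
	memcpy(dst, src, ARIA_AVX_BLOCK_SIZE);
	fpu_section_bytes += ARIA_AVX_BLOCK_SIZE;
	if (fpu_section_bytes > fpu_section_bytes_max)
		fpu_section_bytes_max = fpu_section_bytes;
}

/*
 * Process as many wide blocks as possible, never handling more than
 * FPU_BYTES_MAX bytes inside a single begin/end pair.
 * Returns the number of trailing bytes left for a non-FPU fallback.
 */
static size_t ecb_crypt_chunked(uint8_t *dst, const uint8_t *src, size_t nbytes)
{
	while (nbytes >= ARIA_AVX_BLOCK_SIZE) {
		/* Extra nesting level: cap the work done per FPU section. */
		size_t chunk = nbytes < FPU_BYTES_MAX ? nbytes : FPU_BYTES_MAX;

		kernel_fpu_begin();
		while (chunk >= ARIA_AVX_BLOCK_SIZE) {
			crypt_16way_stub(dst, src);
			dst += ARIA_AVX_BLOCK_SIZE;
			src += ARIA_AVX_BLOCK_SIZE;
			nbytes -= ARIA_AVX_BLOCK_SIZE;
			chunk -= ARIA_AVX_BLOCK_SIZE;
		}
		kernel_fpu_end();
	}
	return nbytes;
}
```

With a 10000-byte buffer, this sketch opens three FPU sections (4096,
4096, and 1792 bytes) and leaves 16 bytes for the non-FPU path. In the
real driver the same cap would sit inside the walk loop, or inside the
shared helper macros if the driver is converted to use them.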