The x86 assembly language implementations using SIMD process data between kernel_fpu_begin() and kernel_fpu_end() calls. That disables scheduler preemption, so prevents the CPU core from being used by other threads. Rather than break the processing into 4 KiB passes, each of which unilaterally calls kernel_fpu_begin() and kernel_fpu_end(), periodically check if the kernel scheduler wants to run something else on the CPU. If so, yield the kernel FPU context and let the scheduler intervene. Suggested-by: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx> Signed-off-by: Robert Elliott <elliott@xxxxxxx> --- arch/x86/crypto/chacha_glue.c | 22 +++++++++++++--------- 1 file changed, 13 insertions(+), 9 deletions(-) diff --git a/arch/x86/crypto/chacha_glue.c b/arch/x86/crypto/chacha_glue.c index 7b3a1cf0984b..892cbae958b8 100644 --- a/arch/x86/crypto/chacha_glue.c +++ b/arch/x86/crypto/chacha_glue.c @@ -146,17 +146,21 @@ void chacha_crypt_arch(u32 *state, u8 *dst, const u8 *src, unsigned int bytes, bytes <= CHACHA_BLOCK_SIZE) return chacha_crypt_generic(state, dst, src, bytes, nrounds); - do { - unsigned int todo = min_t(unsigned int, bytes, SZ_4K); + kernel_fpu_begin(); + for (;;) { + const unsigned int chunk = min(bytes, 4096U); - kernel_fpu_begin(); - chacha_dosimd(state, dst, src, todo, nrounds); - kernel_fpu_end(); + chacha_dosimd(state, dst, src, chunk, nrounds); - bytes -= todo; - src += todo; - dst += todo; - } while (bytes); + bytes -= chunk; + if (!bytes) + break; + + src += chunk; + dst += chunk; + kernel_fpu_yield(); + } + kernel_fpu_end(); } EXPORT_SYMBOL(chacha_crypt_arch); -- 2.38.1