On Fri, 2024-04-05 at 20:26 -0400, Eric Biggers wrote:
> From: Eric Biggers <ebiggers@xxxxxxxxxx>
> 
> Since sha256_transform_rorx() uses ymm registers, execute vzeroupper
> before returning from it.  This is necessary to avoid reducing the
> performance of SSE code.
> 
> Fixes: d34a460092d8 ("crypto: sha256 - Optimized sha256 x86_64 routine using AVX2's RORX instructions")
> Signed-off-by: Eric Biggers <ebiggers@xxxxxxxxxx>
> ---
>  arch/x86/crypto/sha256-avx2-asm.S | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/x86/crypto/sha256-avx2-asm.S b/arch/x86/crypto/sha256-avx2-asm.S
> index 9918212faf91..0ffb072be956 100644
> --- a/arch/x86/crypto/sha256-avx2-asm.S
> +++ b/arch/x86/crypto/sha256-avx2-asm.S
> @@ -714,10 +714,11 @@ SYM_TYPED_FUNC_START(sha256_transform_rorx)
>  	popq	%r15
>  	popq	%r14
>  	popq	%r13
>  	popq	%r12
>  	popq	%rbx
> +	vzeroupper
>  	RET
>  SYM_FUNC_END(sha256_transform_rorx)
>  
>  .section .rodata.cst512.K256, "aM", @progbits, 512
>  .align 64

Acked-by: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
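
For context, the pattern being fixed here is general to any AVX/AVX2 routine: once an instruction writes the upper 128 bits of a ymm register, later legacy-SSE code can run slowly until those upper halves are cleared, and vzeroupper is the cheap way to clear them on exit. Below is a minimal standalone sketch of the same pattern; the function name avx2_copy_32 and its calling convention are hypothetical, invented for illustration, and not part of this patch or the kernel source.

	/*
	 * vzeroupper_demo.S - hypothetical example, not kernel code.
	 * Copies 32 bytes from the second argument (%rsi) to the first
	 * (%rdi) using one ymm register, then clears the upper ymm
	 * halves before returning.  Assemble with: gcc -c vzeroupper_demo.S
	 */
		.text
		.globl	avx2_copy_32
		.type	avx2_copy_32, @function
	avx2_copy_32:
		/* 256-bit unaligned load/store; this dirties ymm0's upper bits */
		vmovdqu	(%rsi), %ymm0
		vmovdqu	%ymm0, (%rdi)
		/*
		 * Zero the upper 128 bits of all ymm registers.  Without
		 * this, subsequent legacy-SSE instructions in the caller
		 * may incur an AVX-to-SSE transition penalty because the
		 * CPU must preserve the dirty upper halves.
		 */
		vzeroupper
		ret
		.size	avx2_copy_32, .-avx2_copy_32

The patch above places vzeroupper immediately before RET (the kernel's return macro) at the single exit point of sha256_transform_rorx(), which mirrors this sketch: do it last, after all ymm use, so every return path leaves the register state clean for SSE callers.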