[+Cc linux-crypto]

Please use reply-all so that the list gets included.

On Mon, Apr 08, 2024 at 04:15:32PM +0200, Stefan Kanthak wrote:
> Hi Eric,
>
> > On Mon, Apr 08, 2024 at 11:26:52AM +0200, Stefan Kanthak wrote:
> >> Use shorter SSE2 instructions instead of some SSE4.1
> >> use short displacements into K256
> >>
> >> --- -/arch/x86/crypto/sha256_ni_asm.S
> >> +++ +/arch/x86/crypto/sha256_ni_asm.S
> >
> > Thanks! I'd like to benchmark this to see how it affects performance,
>
> Performance is NOT affected: if CPUs weren't superscalar, the patch might
> save 2 to 4 processor cycles as it replaces palignr/pblendw (slow) with
> punpck*qdq (fast and shorter)
>
> > but unfortunately this patch doesn't apply. It looks your email client
> > corrupted your patch by replacing tabs with spaces. Can you please use
> > 'git send-email' to send patches?
>
> I don't use git at all; I'll use cURL instead.
> Since the information on vger.kernel.org states "text/plain", no multipart,
> I assume that attachments are also not accepted?

Please read Documentation/process/submitting-patches.rst, which explains how to
submit Linux kernel patches.

> >> + pshufd $0xB1, STATE0, STATE0 /* HGFE */
> >> + pshufd $0x1B, STATE1, STATE1 /* DCBA */
> >>
> >> movdqu STATE0, 0*16(DIGEST_PTR)
> >> movdqu STATE1, 1*16(DIGEST_PTR)
> >
> > Please make sure to run the crypto self-tests too.
>
> I can't, I don't use Linux at all; I just noticed that this function uses
> 4-byte displacements and palignr/pblendw instead of punpck?qdq after pshufd
>
> > The above is storing the two halves of the state in the wrong order.
>
> ARGH, you are right; I recognized it too, wanted to correct it, but was
> interrupted and forgot it after returning to the patch. Sorry.

I'm afraid that if you don't submit a properly formatted and tested patch, your
patch can't be accepted. We can treat it as a suggestion, though since you're
sending actual code it would really help if it had your Signed-off-by.

- Eric