On Tue, 7 Dec 2021 21:29:07 -0800 Jakub Kicinski wrote:
> On Wed, 8 Dec 2021 15:40:37 +1100 Herbert Xu wrote:
> > On Tue, Dec 07, 2021 at 11:32:52AM -0800, Jakub Kicinski wrote:
> > > Hi!
> > >
> > > The x86 AES crypto (gcm(aes)) requires 16B alignment which is hard to
> > > achieve in networking. Is there any reason for this? On any moderately
> > > recent Intel platform aligned and unaligned vmovdq should have the same
> > > performance (reportedly).
> >
> > There is no such thing as an alignment requirement. If an algorithm
> > specifies an alignment and you pass it a request which is unaligned,
> > the Crypto API will automatically align the data for you.
> >
> > So what is the actual problem here?
>
> By align you mean copy right? I'm trying to avoid the copy.

Hm, I'm benchmarking things now and it appears to be a regression
introduced somewhere around 5.11 / 5.12. I don't see the memcpy eating
20% of performance on 5.10. Bisection time.
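
For illustration only, here is a minimal user-space sketch of what "align
means copy" would amount to, assuming (as the question above suggests) that
an unaligned buffer is handled by copying it into an aligned bounce buffer.
This is not the Crypto API's code: align_or_copy(), is_misaligned() and
ALIGNMASK are hypothetical names made up for this example, and the 16-byte
value only mirrors the gcm(aes) case discussed in the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical mask for 16-byte alignment (low four address bits must be 0). */
#define ALIGNMASK 15u

/* Returns nonzero if p is not 16-byte aligned. */
static int is_misaligned(const void *p)
{
	return ((uintptr_t)p & ALIGNMASK) != 0;
}

/*
 * Hypothetical helper, not a kernel function.
 * If src is already aligned, return it unchanged and set *bounce to NULL.
 * Otherwise copy len bytes into a newly allocated aligned buffer, store
 * that buffer in *bounce (caller frees it) and return it.
 * Returns NULL if the allocation fails.
 */
static const void *align_or_copy(const void *src, size_t len, void **bounce)
{
	size_t padded;

	assert(bounce != NULL);
	*bounce = NULL;

	if (!is_misaligned(src))
		return src;	/* fast path: no copy needed */

	/* aligned_alloc() wants a size that is a multiple of the alignment. */
	padded = (len + ALIGNMASK) & ~(size_t)ALIGNMASK;
	if (padded == 0)
		padded = ALIGNMASK + 1;

	*bounce = aligned_alloc(ALIGNMASK + 1, padded);
	if (*bounce == NULL)
		return NULL;

	memcpy(*bounce, src, len);	/* the extra copy being discussed */
	return *bounce;
}
```

The point of the sketch is the slow path: any input that misses the
alignment pays for an allocation plus a memcpy of the whole buffer before
the real work starts, which is the kind of overhead that would show up as
memcpy in a profile.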