On Tue, 9 Apr 2024 at 14:11, Eric Biggers <ebiggers@xxxxxxxxxx> wrote:
>
> On Tue, Apr 09, 2024 at 11:12:11AM +0200, Ard Biesheuvel wrote:
> > On Tue, 9 Apr 2024 at 02:02, Eric Biggers <ebiggers@xxxxxxxxxx> wrote:
> > >
> > > From: Eric Biggers <ebiggers@xxxxxxxxxx>
> > >
> > > Access the AES round keys using offsets -7*16 through 7*16, instead of
> > > 0*16 through 14*16. This allows VEX-encoded instructions to address all
> > > round keys using 1-byte offsets, whereas before some needed 4-byte
> > > offsets. This decreases the code size of aes-xts-avx-x86_64.o by 4.2%.
> > >
> > > Signed-off-by: Eric Biggers <ebiggers@xxxxxxxxxx>
> >
> > Nice optimization!
> >
> > Do you think we might be able to macrofy this a bit so we can use zero
> > based indexing for the round keys, and hide the arithmetic?
> >
>
> There are two alternatives I considered: defining variables KEYOFF0 through
> KEYOFF14 and writing the offsets as KEYOFF\i(KEY), or defining one variable
> KEYOFF and writing the offsets as \i*16-KEYOFF(KEY). I think I slightly prefer
> the current patch where it's less abstracted out, though. It makes it clear the
> offsets really are single-byte, and also index 7 is the exact mid-point so going
> from -7 to 7 still feels fairly natural. If we wanted to do something more
> complex like use different offsets for AVX vs. AVX512, then we'd need the
> abstraction to handle that, but it doesn't seem useful to do that.
>

Fair enough.
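
For anyone following along, the offset arithmetic from the patch description can be checked with a rough sketch like the one below. It is not code from the patch: the names (ROUND_KEY_SIZE, fits_disp8, offsets) are made up for illustration, and it assumes only that a 1-byte displacement is a signed 8-bit value (-128..127) and that there are 15 round keys, as implied by the 0*16 through 14*16 range.

```python
# Illustrative sketch only; all names here are hypothetical, not from the patch.
ROUND_KEY_SIZE = 16   # bytes per AES round key
NUM_ROUND_KEYS = 15   # indices 0..14, matching 0*16 through 14*16

def fits_disp8(offset):
    """True if the offset fits in a signed 8-bit (1-byte) displacement."""
    return -128 <= offset <= 127

def offsets(bias):
    """Byte offsets of all round keys when the base pointer sits at key `bias`."""
    return [(i - bias) * ROUND_KEY_SIZE for i in range(NUM_ROUND_KEYS)]

zero_based = offsets(0)   # 0*16 .. 14*16
centered = offsets(7)     # -7*16 .. 7*16

# Offsets that would not fit in one byte and so need a 4-byte displacement.
wide_zero_based = [o for o in zero_based if not fits_disp8(o)]
wide_centered = [o for o in centered if not fits_disp8(o)]

print("zero-based offsets needing 4 bytes:", wide_zero_based)
print("centered offsets needing 4 bytes:", wide_centered)
```

With zero-based offsets, everything from 8*16 upward falls outside the signed 8-bit range, while the centered range stays within it, which is consistent with the "some needed 4-byte offsets" remark in the patch description.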