On Fri, Oct 19, 2018 at 01:41:35PM +0800, Ard Biesheuvel wrote:
> On 18 October 2018 at 12:37, Eric Biggers <ebiggers@xxxxxxxxxx> wrote:
> > From: Eric Biggers <ebiggers@xxxxxxxxxx>
> >
> > Make the ARM scalar AES implementation closer to constant-time by
> > disabling interrupts and prefetching the tables into L1 cache. This is
> > feasible because due to ARM's "free" rotations, the main tables are only
> > 1024 bytes instead of the usual 4096 used by most AES implementations.
> >
> > On ARM Cortex-A7, the speed loss is only about 5%. The resulting code
> > is still over twice as fast as aes_ti.c. Responsiveness is potentially
> > a concern, but interrupts are only disabled for a single AES block.
> >
>
> So that would be in the order of 700 cycles, based on the numbers you
> shared in v1 of the aes_ti.c patch. Does that sound about right? So
> that would be around 1 microsecond, which is really not a number to
> obsess about imo.
>

Correct, on ARM Cortex-A7 I'm seeing slightly over 700 cycles per block
encrypted or decrypted, including the prefetching.

- Eric