Re: [PATCH v3 11/15] crypto: x86/aes-kl - Support AES algorithm using Key Locker instructions

"Bae, Chang Seok" <chang.seok.bae@xxxxxxxxx> · Mon, 6 Dec 2021 22:59:09 +0000

On Dec 6, 2021, at 14:14, Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
> On Tue, 30 Nov 2021 at 07:57, Bae, Chang Seok <chang.seok.bae@xxxxxxxxx> wrote:
>> 
>> 
>> No, these two instruction sets are separate. So I think no room to share the
>> ASM code.
> 
> On arm64, we have
> 
> aes-ce.S, which uses AES instructions to implement the AES core transforms
> 
> aes-neon.S, which uses plain NEON instructions to implement the AES
> core transforms
> 
> aes-modes.S, which can be combined with either of the above, and
> implements the various chaining modes (ECB, CBC, CTR, XTS, and a
> helper for CMAC, CBCMAC and XMAC)
> 
> If you have two different primitives for performing AES transforms
> (the original round by round one, and the KL one that does 10 or 14
> rounds at a time), you should still be able to reuse most of the code
> that implements the non-trivial handling of the chaining modes.

Yes, there is no question that sharing would be better for maintainability.
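
As a rough illustration of the structure you describe (this is only a
userspace C sketch, not the kernel code): the chaining-mode logic is written
once against a block primitive, and either backend can be plugged in. All of
the names below (blk_fn, enc_by_rounds, enc_all_at_once, cbc_encrypt) are
hypothetical, and the "cipher" is a toy stand-in, not AES.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLK 16

/* Hypothetical primitive signature: transform one 16-byte block in place. */
typedef void (*blk_fn)(uint8_t blk[BLK], const uint8_t *key);

/* Toy stand-in for a single round (NOT AES): byte-wise XOR and add. */
static void toy_round(uint8_t blk[BLK], const uint8_t *key, int r)
{
	for (int i = 0; i < BLK; i++)
		blk[i] = (uint8_t)((blk[i] ^ key[(i + r) % BLK]) + r);
}

/* Backend 1: caller-visible round-by-round primitive (AES-NI style). */
static void enc_by_rounds(uint8_t blk[BLK], const uint8_t *key)
{
	for (int r = 0; r < 10; r++)
		toy_round(blk, key, r);
}

/* Backend 2: all rounds behind a single call (KL style); same result. */
static void enc_all_at_once(uint8_t blk[BLK], const uint8_t *key)
{
	for (int i = 0; i < BLK; i++) {
		uint8_t b = blk[i];

		for (int r = 0; r < 10; r++)
			b = (uint8_t)((b ^ key[(i + r) % BLK]) + r);
		blk[i] = b;
	}
}

/* Shared CBC chaining logic, independent of which backend is used. */
static void cbc_encrypt(blk_fn enc, const uint8_t *key, const uint8_t iv[BLK],
			const uint8_t *in, uint8_t *out, size_t nblocks)
{
	uint8_t chain[BLK];

	memcpy(chain, iv, BLK);
	for (size_t n = 0; n < nblocks; n++) {
		for (int i = 0; i < BLK; i++)
			chain[i] ^= in[n * BLK + i];	/* XOR with previous block */
		enc(chain, key);
		memcpy(out + n * BLK, chain, BLK);
	}
}
```

In C the indirection is free to pick any calling convention. In the ASM the
"primitive" is a macro expanded over fixed registers, which is where the
trouble starts.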

However, besides the fact that a single KL instruction performs all of the
rounds, some AES-KL instructions have register constraints. For example,
AESENCWIDE256KL always uses XMM0-7 for its input blocks.

Today, the AES-NI code maintains 32-bit compatibility, e.g. it clobbers
XMM2-3 for the key and the input vector. Sharing the code would therefore
make the AES-KL code inefficient, and I think even ugly, because of this
register constraint. For example, the AES-KL code uses XMM9-10 for the key
and an input vector, so it would have to move them around just for the sake
of code sharing.

Thanks,
Chang