Re: [PATCH v8 12/12] crypto: x86/aes-kl - Implement the AES-XTS algorithm

"Chang S. Bae" <chang.seok.bae@xxxxxxxxx> · Wed, 7 Jun 2023 15:06:48 -0700

On 6/6/2023 10:35 PM, Eric Biggers wrote:
On Sat, Jun 03, 2023 at 08:22:27AM -0700, Chang S. Bae wrote:

Can you also mention why you are doing this?  I suppose it might as well be
done, but I'm not seeing how it would actually matter.

While this crypto implementation runs in kernel mode, userspace can 
call it:
    https://docs.kernel.org/crypto/userspace-if.html

And those AES instructions are executable in userspace.

Say someone extracts a key handle from the kernel and then tries to use 
it to decrypt some disk image from userspace. The restriction at least 
prevents that.

What other sorts of key usage restrictions does AES-KL support?  Are any other
ones useful here?

Besides this, there are additional bits that restrict a handle from 
being used for encryption and for decryption, respectively.

These are described in Section 1.1.1.1, 'Handle Restrictions', of the 
whitepaper:

https://www.intel.com/content/www/us/en/develop/download/intel-key-locker-specification.html
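
To make the restriction bits concrete, here is a small userspace model of 
how they gate handle use. The bit positions follow my reading of the 
specification (bit 0: no use outside ring 0, bit 1: no encryption, bit 2: 
no decryption). The macro and function names are made up for this sketch 
and are not from the patch; real hardware reports a violation through ZF 
rather than a return value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; bit positions as I read them in the spec. */
#define KL_RESTRICT_CPL0_ONLY	(1u << 0)	/* no use outside ring 0 */
#define KL_RESTRICT_NO_ENCRYPT	(1u << 1)	/* no use for encryption */
#define KL_RESTRICT_NO_DECRYPT	(1u << 2)	/* no use for decryption */

enum kl_op { KL_OP_ENCRYPT, KL_OP_DECRYPT };

/*
 * Software model only: returns true if a handle created with the given
 * restriction bits may be used for 'op' at privilege level 'cpl'.
 */
static bool kl_handle_usable(uint32_t restrictions, int cpl, enum kl_op op)
{
	assert(cpl >= 0 && cpl <= 3);

	if ((restrictions & KL_RESTRICT_CPL0_ONLY) && cpl != 0)
		return false;
	if ((restrictions & KL_RESTRICT_NO_ENCRYPT) && op == KL_OP_ENCRYPT)
		return false;
	if ((restrictions & KL_RESTRICT_NO_DECRYPT) && op == KL_OP_DECRYPT)
		return false;
	return true;
}
```

With only the ring-0 bit set, kl_handle_usable(KL_RESTRICT_CPL0_ONLY, 3, 
KL_OP_DECRYPT) is false, which is the userspace disk-image case above.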

Subsequently the key handle could be corrupted or fail with handle
restrictions. Then, encrypt()/decrypt() returns -EINVAL.

Aren't these scenarios actually impossible?  At least without memory corruption.

Yes, I think they are impossible in the dm-crypt path. But the key 
handle can be tainted when the API is reached from userspace.

I think this may help users, as the feature performs some integrity 
checks up front and returns an error right away if something is wrong.
Thus, advertise it with a unique name 'xts-aes-aeskl' in /proc/crypto while
not replacing AES-NI under the generic name 'xts(aes)' with a lower priority.

The above sentence seems to say that xts-aes-aeskl does *not* have a lower
priority than xts-aes-aesni.  But actually it does.

No, it does not say that. But the text needs to call out the latter 
part more clearly.
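
What the sentence is meant to describe can be modeled roughly like this: 
a lookup by driver name returns that driver, while a lookup by the generic 
name returns the highest-priority match. This is a simplified userspace 
sketch of the crypto API's selection rule, not kernel code, and the 
priority numbers in the table are illustrative placeholders only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct alg_entry {
	const char *generic_name;	/* e.g. "xts(aes)" */
	const char *driver_name;	/* e.g. "xts-aes-aeskl" */
	int priority;
};

/* Illustrative values; the only point is that aeskl ranks lower. */
static const struct alg_entry sample_algs[] = {
	{ "xts(aes)", "xts-aes-aesni", 401 },
	{ "xts(aes)", "xts-aes-aeskl", 200 },
};

/*
 * An exact driver-name match wins immediately. Otherwise pick the
 * highest-priority entry whose generic name matches.
 */
static const struct alg_entry *pick_alg(const struct alg_entry *tbl,
					size_t n, const char *name)
{
	const struct alg_entry *best = NULL;

	assert(name != NULL);
	for (size_t i = 0; i < n; i++) {
		if (strcmp(tbl[i].driver_name, name) == 0)
			return &tbl[i];
		if (strcmp(tbl[i].generic_name, name) == 0 &&
		    (!best || tbl[i].priority > best->priority))
			best = &tbl[i];
	}
	return best;
}
```

So a request for "xts(aes)" still lands on AES-NI, and Key Locker is used 
only when "xts-aes-aeskl" is asked for by name.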

Then, the performance is unlikely better than 64-bit which has already a gap
vs. AES-NI.

I don't understand what this sentence is trying to say.

This is from another section, which explains why the code is 64-bit 
only. I added it as one more reason to avoid 32-bit code. But anyway, 
32-bit kernel mode is known to be on its way to deprecation, so the 
128-bit register story seems to be enough there.

+config AS_HAS_KEYLOCKER
+	def_bool $(as-instr,encodekey256 %eax$(comma)%eax)
+	help
+	  Supported by binutils >= 2.36 and LLVM integrated assembler >= V12

It looks like arch/x86/Kconfig.assembler would be a better place for this.

Yeah, commit 5e8ebd841a44 ("x86: probe assembler capabilities via 
kconfig instead of makefile") moved those over there.

+
+#define IN1	%xmm8
+#define IN	IN1

Why do both IN1 and IN exist?  Shouldn't there just be IN?

Oh, this is a silly leftover from the CBC code, which has multiple inputs.

So: #define IN %xmm8, then s/IN1/IN/g

+
+#define AREG	%rax

Shouldn't %rax just be hardcoded?

I thought this (or any other) renaming helps readability. Maybe I'm 
missing something. Could you share your thoughts on this?

+#define HANDLEP	%rdi

This should be called CTX, to match the function prototypes.

+#define UKEYP	OUTP

This should be called IN_KEY, to match the function prototypes.

Okay. But, OTOH, the prototype itself is somewhat generic, so its 
argument names do not always match what they mean in the code. Thus, 
AES-NI renamed them like

    ctx    -> KEYP
    in_key -> UKEY
    ...

So, another option is to leave some comments there, e.g. '# ctx is 
renamed to KEYP'.

+
+.Lsetkey_end:
+	movdqu STATE1, (HANDLEP)
+	movdqu STATE2, 0x10(HANDLEP)
+	movdqu STATE3, 0x20(HANDLEP)

The moves to the ctx should use movdqa, since it is aligned.

Reading the manual, the difference is whether a #GP is generated when a 
misaligned memory operand is given. Using MOVDQA everywhere here then 
seems to say: please check the alignment every time.

But HANDLEP is known to hold an aligned address. So the plain move 
(MOVDQU) seems to be enough, and it is consistent with the glue code: 
avoid unnecessary sanity checks.
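
For reference, the alignment property in question can be sketched in plain 
C. MOVDQA requires a 16-byte aligned memory operand and raises #GP 
otherwise; MOVDQU accepts any address. The helper names below are 
hypothetical; aligned_ctx() only stands in for the idea that the glue code 
hands an already aligned context pointer to the assembly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HANDLE_ALIGN 16u	/* alignment MOVDQA expects */

/* Round addr up to the next multiple of HANDLE_ALIGN. */
static uintptr_t align_up16(uintptr_t addr)
{
	return (addr + HANDLE_ALIGN - 1) & ~(uintptr_t)(HANDLE_ALIGN - 1);
}

/* True if p could be used as a MOVDQA memory operand without a fault. */
static bool is_aligned16(const void *p)
{
	return ((uintptr_t)p & (HANDLE_ALIGN - 1)) == 0;
}

/* Hypothetical stand-in for the glue code returning an aligned context. */
static void *aligned_ctx(void *raw_ctx)
{
	void *ctx = (void *)align_up16((uintptr_t)raw_ctx);

	assert(is_aligned16(ctx));
	return ctx;
}
```

Once the pointer comes out of something like aligned_ctx(), either 
instruction stores the same bytes; the only difference left is whether the 
CPU re-checks the alignment on each access.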

+
+	xor AREG, AREG
+	FRAME_END
+	RET
+SYM_FUNC_END(__aeskl_setkey)

This function always returns 0, so it really should return void.

Yeah, fair enough.

In the common case (successful AES-256 encryption) this is executing 'jmp'
twice.  I think the code should be rearranged to eliminate these jmps.

Ah, right. Good point! Let me tweak this for the most likely cases.

__aeskl_xts_encrypt() and __aeskl_xts_decrypt() are very similar.  To reduce
code duplication, can you consider generating them from a macro that takes an
argument that indicates whether it is encrypt or decrypt?

Yeah, I can see that the code preparing the operands is common between 
them. But I'm not sure folding them together would make it more readable.

Something that your AES-KL code does that's a bit ugly is that it abuses
'struct crypto_aes_ctx' to store a Keylocker key handle instead
of the actual AES key schedule which the struct is supposed to be for.

The proper way to represent that would be to make the tfm context for
xts-aes-aeskl be a union of crypto_aes_ctx and a Keylocker specific context.

Agreed. I think this is likely fallout from the struct aesni_xts_ctx 
fix. Previously, the field was a byte array, which by itself does not 
necessarily represent the extended-key format. The fix changed it to be 
more specific. Accordingly, Key Locker has to be specific about it, too.
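
A rough sketch of what such a union could look like, written as a 
userspace mirror so it compiles standalone. The crypto_aes_ctx layout 
follows the kernel definition as I remember it; struct aeskl_ctx, union 
x86_aes_ctx, and aeskl_handle_size() are hypothetical names and layouts, 
not what the next revision will necessarily use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define AES_MAX_KEYLENGTH_U32 60	/* 15 round keys * 16 bytes / 4 */

/* Userspace mirror of the expanded-key context used by AES-NI. */
struct crypto_aes_ctx {
	uint32_t key_enc[AES_MAX_KEYLENGTH_U32];
	uint32_t key_dec[AES_MAX_KEYLENGTH_U32];
	uint32_t key_length;
};

/* Hypothetical Key Locker context: a handle, not a key schedule. */
struct aeskl_ctx {
	uint8_t  handle[64];	/* room for an AES-256 handle (512 bits) */
	uint32_t key_length;
};

/* One tfm context type that can hold either representation. */
union x86_aes_ctx {
	struct crypto_aes_ctx aesni;
	struct aeskl_ctx      aeskl;
};

/* Handle size in bytes: 384 bits for AES-128, 512 bits for AES-256. */
static size_t aeskl_handle_size(uint32_t key_length)
{
	assert(key_length == 16 || key_length == 32);
	return key_length == 16 ? 48 : 64;
}
```

Note that key_length sits at a different offset in the two members of this 
sketch, so the assembly would have to use the offset that matches the 
member it is given.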

Thanks,
Chang