Re: RPCSEC GSS krb5 KUnit test fails on arm64 with h/w accelerated ciphers enabled

Ard Biesheuvel <ardb@xxxxxxxxxx> · Fri, 28 Apr 2023 17:09:47 +0100

On Fri, 28 Apr 2023 at 13:59, Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
>
>
>
> > On Apr 28, 2023, at 5:57 AM, Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
> >
> > On Fri, 28 Apr 2023 at 10:44, Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx> wrote:
> >>
> >> Scott Mayhew <smayhew@xxxxxxxxxx> wrote:
> >>>
> >>> diff --git a/arch/arm64/crypto/aes-modes.S b/arch/arm64/crypto/aes-modes.S
> >>> index 0e834a2c062c..477605fad76b 100644
> >>> --- a/arch/arm64/crypto/aes-modes.S
> >>> +++ b/arch/arm64/crypto/aes-modes.S
> >>> @@ -268,6 +268,7 @@ AES_FUNC_START(aes_cbc_cts_encrypt)
> >>>       add             x4, x0, x4
> >>>       st1             {v0.16b}, [x4]                  /* overlapping stores */
> >>>       st1             {v1.16b}, [x0]
> >>> +       st1             {v1.16b}, [x5]
> >>>       ret
> >>> AES_FUNC_END(aes_cbc_cts_encrypt)
> >>>
> >>> But I don't know if that change is at all correct! (I've never even
> >>> looked at arm64 asm before).  If someone who's knowledgeable about this
> >>> code could chime in, I'd appreciate it.
> >>
> >> Ard, could you please take a look at this?
> >>
> >
> > The issue seems to be that the caller expects iv_out to have been
> > populated even when doing ciphertext stealing.
> >
> > This is meaningless, because the output IV can only be used to chain
> > additional encrypted data if the split is at a block boundary. The
> > ciphertext stealing fundamentally terminates the encryption, and
> > produces a block of ciphertext that is shorter than the block size, so
> > what the output IV should be is actually unspecified.
> >
> > IOW, test cases having plain/ciphertext vectors whose size is not a
> > multiple of the cipher block size should not specify an expected value
> > for the output IV.
>
> The test cases are extracted from RFC 3962 Appendix B. The
> standard clearly expects there to be a non-zero next IV for
> plaintext sizes that are not block-aligned.
>

OK, so this is the Kerberos V specific spec on how to use AES in CBC
mode, which appears to describe how to chain multiple CBC encryptions
together.

CBC-CTS itself does not define this: the IV is an input only, and the
reason we happen to return the IV is because a single CBC operation
may be broken up into several ones, which can only be done on block
boundaries. This is why CBC-CTS itself passes all its tests: a single
CBC-CTS encryption only performs ciphertext stealing at the very end,
and the next IV is never used in that case. (This is why the CTS mode
tests in crypto/testmgr.h don't have iv_out vectors)

Note that RFC3962 defines that the penultimate block of CBC-CTS
ciphertext is used as the next IV. CBC returns the last ciphertext
block as the output IV. It is a happy coincidence that the generic CTS
template encapsulates CBC in a way where its output IV ends up in the
right place.

> Also, these test cases pass on other hardware platforms.
>

Fair enough.

I am not opposed to fixing this, but given that it is the RFC3962 spec
that defines that the next IV is the penultimate full block of the
previous CBC-CTS ciphertext, we might consider doing the memcpy() in
the Kerberos code not in the CBC-CTS implementations. (The 32-bit ARM
code will be broken in the same way, and potentially other
implementations as well)