On Tue, 10 Dec 2019 at 12:04, Keerthy <j-keerthy@xxxxxx> wrote:
>
>
>
> On 10/12/19 3:37 pm, Ard Biesheuvel wrote:
> > On Tue, 10 Dec 2019 at 11:06, Keerthy <j-keerthy@xxxxxx> wrote:
> >>
> >>
> >>
> >> On 10/12/19 3:31 pm, Ard Biesheuvel wrote:
> >>> Hello Keerthy,
> >>>
> >>> On Tue, 10 Dec 2019 at 10:35, Keerthy <j-keerthy@xxxxxx> wrote:
> >>>>
> >>>> Hi Ard,
> >>>>
> >>>> I am not sure if am the first one to report this. It seems like
> >>>> aes_expandkey is giving me different expansion over what i get with the
> >>>> older crypto_aes_expand_key which was removed with the below commit:
> >>>>
> >>>> commit 5bb12d7825adf0e80b849a273834f3131a6cc4e1
> >>>> Author: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx>
> >>>> Date: Tue Jul 2 21:41:33 2019 +0200
> >>>>
> >>>>     crypto: aes-generic - drop key expansion routine in favor of library
> >>>>     version
> >>>>
> >>>> The key that is being expanded is from the crypto aes(cbc) testsuite:
> >>>>
> >>>> }, { /* From NIST SP800-38A */
> >>>>         .key = "\x8e\x73\xb0\xf7\xda\x0e\x64\x52"
> >>>>                "\xc8\x10\xf3\x2b\x80\x90\x79\xe5"
> >>>>                "\x62\xf8\xea\xd2\x52\x2c\x6b\x7b",
> >>>>         .klen = 24,
> >>>>
> >>>>
> >>>> The older version crypto_aes_expand_key output that passes the cbc(aes)
> >>>> decryption test:
> > ...
> >>>>
> >>>> The difference is between 52nd index through 59.
> >>>>
> >>>> Any ideas if this is expected?
> >>>>
> >>>
> >>> Yes, this is expected. This particular test vector uses a 192 bit key,
> >>> so those values are DC/ignored.
> >>
> >> Thanks for the quick response. However with the new implementation
> >> decryption test case fails for me with wrong result.
> >
> > Can you share more details please? Platform, endianness, etc ..
>
> Ard,
>
> I am trying to get aes working on a yet to be upstream TI HW crypto
> Accelerator SA2UL. It is little endian.
>
> I had posted a series earlier this year:
>
> https://lkml.org/lkml/2019/6/28/20
>
> The device expects the inverse key for decryption.
>

Could you elaborate?
There is no such thing as an inverse *key*, only an inverse *key
schedule* which is used for the Equivalent Inverse Cipher.

AES-192 expands the 24 byte key to 13 round keys consisting of 4
32-bit words each, and so the algorithm does not actually use the
contents of slots 52 and up in this case.

> In the earlier working version i was copying the ctx.key_enc[48] to
> ctx.key_enc[53] index of the ctx.key_enc array as the 24 bytes of
> decryption key to my hardware.
>
> Now as told earlier the 52nd & 53rd words are changed and hence i end up
> in wrong result.
>
> Fail:
>
> ctx.key_dec[48] = 0xf7b0738e & ctx.key_enc[48] = 0x6fa08be9
> ctx.key_dec[49] = 0x52640eda & ctx.key_enc[49] = 0x3c778c44
> ctx.key_dec[50] = 0x2bf310c8 & ctx.key_enc[50] = 0x472cc8e
> ctx.key_dec[51] = 0xe5799080 & ctx.key_enc[51] = 0x2220001
> ctx.key_dec[52] = 0x13eaf950 & ctx.key_enc[52] = 0x13eaf850
> ctx.key_dec[53] = 0xffff8000 & ctx.key_enc[53] = 0xffff8000
>
> Pass:
>
> ctx.key_dec[48] = 0xf7b0738e & ctx.key_enc[48] = 0x6fa08be9
> ctx.key_dec[49] = 0x52640eda & ctx.key_enc[49] = 0x3c778c44
> ctx.key_dec[50] = 0x2bf310c8 & ctx.key_enc[50] = 0x472cc8e
> ctx.key_dec[51] = 0xe5799080 & ctx.key_enc[51] = 0x2220001
> ctx.key_dec[52] = 0x105127e8 & ctx.key_enc[52] = 0x68342d29
> ctx.key_dec[53] = 0xffff8000 & ctx.key_enc[53] = 0xddd31195
>

The old code does the following for AES-192

#define loop6(i) do { \
	t = ror32(t, 8); \
	t = ls_box(t) ^ rco_tab[i]; \
	t ^= ctx->key_enc[6 * i]; \
	ctx->key_enc[6 * i + 6] = t; \
	t ^= ctx->key_enc[6 * i + 1]; \
	ctx->key_enc[6 * i + 7] = t; \
	t ^= ctx->key_enc[6 * i + 2]; \
	ctx->key_enc[6 * i + 8] = t; \
	t ^= ctx->key_enc[6 * i + 3]; \
	ctx->key_enc[6 * i + 9] = t; \
	t ^= ctx->key_enc[6 * i + 4]; \
	ctx->key_enc[6 * i + 10] = t; \
	t ^= ctx->key_enc[6 * i + 5]; \
	ctx->key_enc[6 * i + 11] = t; \
} while (0)

	case AES_KEYSIZE_192:
		ctx->key_enc[4] = get_unaligned_le32(in_key + 16);
		t = ctx->key_enc[5] = get_unaligned_le32(in_key + 20);
		for (i = 0; i < 8; ++i)
			loop6(i);
		break;

so while it happens to populate slots 52 and 53 as well (when i == 7),
the AES spec does not actually cover this, given that those values are
not actually used in the computation (and I am at a loss understanding
why it should make a difference in your case).

In any case, you can work around this by calculating the missing
values in your driver's expand_key() routine,

ctx.key_enc[52] = ctx.key_enc[51] ^ ctx.key_enc[46];
ctx.key_enc[53] = ctx.key_enc[52] ^ ctx.key_enc[47];
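
For illustration only, the two assignments could be wrapped in a small
helper along the following lines. This is a rough sketch, not driver
code: aes192_fill_tail() and AES192_SCHED_WORDS are made-up names, plain
uint32_t stands in for the kernel's u32, and it assumes the caller has
already expanded the 24 byte key so that words 0..51 are valid and that
the array has room for at least 54 words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Schedule words AES-192 actually uses: 13 round keys * 4 words each. */
#define AES192_SCHED_WORDS 52

/*
 * Hypothetical helper (not an existing kernel API): recompute the two
 * extra words that the old loop6() happened to store for i == 7.
 *
 * Assumes key_enc[0..51] already hold a valid AES-192 encryption
 * schedule and that key_enc has space for at least 54 words.
 */
void aes192_fill_tail(uint32_t *key_enc)
{
	assert(key_enc != NULL);

	/* Same XOR chain the tail of loop6(7) would have continued with. */
	key_enc[AES192_SCHED_WORDS] =
		key_enc[AES192_SCHED_WORDS - 1] ^ key_enc[46];
	key_enc[AES192_SCHED_WORDS + 1] =
		key_enc[AES192_SCHED_WORDS] ^ key_enc[47];
}
```

Calling this right after aes_expandkey() in the driver's setkey path
should bring slots 52 and 53 back to what the old routine left there,
since loop6(7) derived them from the same three words. Slots 54 and up
are left untouched.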