Re: [PATCH 2/2] crypto: aegis/generic - fix for big endian systems

On 1 October 2018 at 09:50, Ondrej Mosnacek <omosnace@xxxxxxxxxx> wrote:
> Hi Ard,
>
> On Sun, Sep 30, 2018 at 10:59 AM Ard Biesheuvel
> <ard.biesheuvel@xxxxxxxxxx> wrote:
>> Use the correct __le32 annotation and accessors to perform the
>> single round of AES encryption performed inside the AEGIS transform.
>> Otherwise, tcrypt reports:
>>
>>   alg: aead: Test 1 failed on encryption for aegis128-generic
>>   00000000: 6c 25 25 4a 3c 10 1d 27 2b c1 d4 84 9a ef 7f 6e
>>   alg: aead: Test 1 failed on encryption for aegis128l-generic
>>   00000000: cd c6 e3 b8 a0 70 9d 8e c2 4f 6f fe 71 42 df 28
>>   alg: aead: Test 1 failed on encryption for aegis256-generic
>>   00000000: aa ed 07 b1 96 1d e9 e6 f2 ed b5 8e 1c 5f dc 1c
>
> Hm...  I think the reason I made a mistake here is that I first had a
> version with the AES table hard-coded and I had an #ifdef <is big
> endian> #else #endif there with values for little-endian and
> big-endian variants.  Then I realized the aes_generic module exports
> the crypto_ft_table and rewrote the code to use that.  Somewhere along
> the way I forgot to check if the aes_generic table uses the same trick
> and correct the code...
>
> It would be nice to apply the same optimization to aes_generic.c, but
> unfortunately the current tables are exported so changing the
> convention would break external modules that use them :/
>

Indeed. I am doing some refactoring work on the AES code, which is how
I ran into this in the first place.

https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=for-kernelci
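For anyone following along without a big-endian box: the failure comes from reading the key block as native u32 words while the AES table values are in CPU order. A minimal userspace sketch of the corrected pattern follows. The helpers load_le32(), store_le32() and xor_key_word() are hypothetical stand-ins written for this illustration, not kernel functions; they only approximate what le32_to_cpu() and cpu_to_le32() do in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for le32_to_cpu() on a __le32 in memory:
 * assemble the value byte by byte, so the result does not depend on
 * the byte order of the machine running the code.
 */
static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}

/* Hypothetical stand-in for storing cpu_to_le32(v) to memory. */
static void store_le32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)((v >> 8) & 0xff);
	p[2] = (uint8_t)((v >> 16) & 0xff);
	p[3] = (uint8_t)((v >> 24) & 0xff);
}

/*
 * Mirrors the shape of
 *	d[i] = cpu_to_le32(d0 ^ le32_to_cpu(k[i]));
 * 'd' is a CPU-order word (e.g. XOR of table lookups), 'key' holds
 * four little-endian key bytes, 'out' receives the little-endian result.
 */
static void xor_key_word(uint8_t out[4], uint32_t d, const uint8_t key[4])
{
	store_le32(out, d ^ load_le32(key));
}

/* Round-trip sanity check: store followed by load returns the input. */
static void le32_selfcheck(void)
{
	uint8_t buf[4];

	store_le32(buf, 0x11223344u);
	assert(load_le32(buf) == 0x11223344u);
}
```

With key bytes 01 02 03 04 and d = 0xa56363c6, the output bytes are the same on little-endian and big-endian hosts, which is the property the original u32 accesses did not have.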

>>
>> While at it, let's refer to the first precomputed table only, and
>> derive the other ones by rotation. This reduces the D-cache footprint
>> by 75%, and the rotations should be cheap or even free on load/store
>> architectures (and X86 has its own AES-NI based implementation).
>
> Could you maybe extract this into a separate patch?  I don't think we
> should mix functional and performance fixes together.
>

Yeah, good point. I will do that and fold in the simplification.
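As a rough illustration of why the simplification is safe, here is a userspace sketch that checks the rotation identity the patch relies on: table n equals table 0 rotated left by 8*n bits. Everything in it is an assumption made for the example. The tables are rebuilt from the AES S-box rather than taken from aes_generic, the byte layout is chosen so that ft_tab[0][0] comes out as 0xa56363c6 (which I believe matches crypto_ft_tab), and rol32() is a local reimplementation, not the kernel helper.

```c
#include <assert.h>
#include <stdint.h>

static uint8_t sbox[256];
static uint32_t ft_tab[4][256];	/* local rebuild, not the kernel's table */

/* Local reimplementation of a 32-bit rotate left. */
static uint32_t rol32(uint32_t w, unsigned int s)
{
	return (w << (s & 31)) | (w >> ((32 - s) & 31));
}

static uint8_t rotl8(uint8_t x, unsigned int s)
{
	return (uint8_t)((x << s) | (x >> (8 - s)));
}

/* Multiply by 2 in GF(2^8) with the AES polynomial 0x11b. */
static uint8_t xtime(uint8_t s)
{
	return (uint8_t)((s << 1) ^ ((s & 0x80) ? 0x1b : 0));
}

/* Generate the AES S-box: multiplicative inverse plus affine transform. */
static void build_sbox(void)
{
	uint8_t p = 1, q = 1;

	do {
		/* p walks the field by repeated multiplication by 3 ... */
		p = (uint8_t)(p ^ (p << 1) ^ ((p & 0x80) ? 0x1b : 0));
		/* ... while q is divided by 3, so q stays the inverse of p */
		q ^= q << 1;
		q ^= q << 2;
		q ^= q << 4;
		if (q & 0x80)
			q ^= 0x09;
		sbox[p] = (uint8_t)(q ^ rotl8(q, 1) ^ rotl8(q, 2) ^
				    rotl8(q, 3) ^ rotl8(q, 4) ^ 0x63);
	} while (p != 1);
	sbox[0] = 0x63;	/* zero has no inverse; defined separately */
}

/* Build all four tables independently, without using rotation. */
static void build_tables(void)
{
	unsigned int x;

	build_sbox();
	assert(sbox[0] == 0x63);
	for (x = 0; x < 256; x++) {
		uint32_t s = sbox[x];
		uint32_t s2 = xtime(sbox[x]);
		uint32_t s3 = s2 ^ s;

		ft_tab[0][x] = s2 | (s << 8) | (s << 16) | (s3 << 24);
		ft_tab[1][x] = s3 | (s2 << 8) | (s << 16) | (s << 24);
		ft_tab[2][x] = s | (s3 << 8) | (s2 << 16) | (s << 24);
		ft_tab[3][x] = s | (s << 8) | (s3 << 16) | (s2 << 24);
	}
}

/* Count entries where table n differs from table 0 rotated by 8*n. */
static unsigned int count_rotation_mismatches(void)
{
	unsigned int n, x, bad = 0;

	for (n = 0; n < 4; n++)
		for (x = 0; x < 256; x++)
			if (rol32(ft_tab[0][x], 8 * n) != ft_tab[n][x])
				bad++;
	return bad;
}
```

Calling build_tables() and then count_rotation_mismatches() should report no mismatches, which is why three of the four 1 KB tables can be dropped from the hot path at the cost of one rotate per lookup.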

>>
>> Fixes: f606a88e5823 ("crypto: aegis - Add generic AEGIS AEAD implementations")
>> Cc: <stable@xxxxxxxxxxxxxxx> # v4.18+
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx>
>> ---
>>  crypto/aegis.h | 23 +++++++++-----------
>>  1 file changed, 10 insertions(+), 13 deletions(-)
>>
>> diff --git a/crypto/aegis.h b/crypto/aegis.h
>> index f1c6900ddb80..84d3e07a3c33 100644
>> --- a/crypto/aegis.h
>> +++ b/crypto/aegis.h
>> @@ -21,7 +21,7 @@
>>
>>  union aegis_block {
>>         __le64 words64[AEGIS_BLOCK_SIZE / sizeof(__le64)];
>> -       u32 words32[AEGIS_BLOCK_SIZE / sizeof(u32)];
>> +       __le32 words32[AEGIS_BLOCK_SIZE / sizeof(__le32)];
>>         u8 bytes[AEGIS_BLOCK_SIZE];
>>  };
>>
>> @@ -59,22 +59,19 @@ static void crypto_aegis_aesenc(union aegis_block *dst,
>>  {
>>         u32 *d = dst->words32;
>>         const u8  *s  = src->bytes;
>> -       const u32 *k  = key->words32;
>> +       const __le32 *k  = key->words32;
>>         const u32 *t0 = crypto_ft_tab[0];
>> -       const u32 *t1 = crypto_ft_tab[1];
>> -       const u32 *t2 = crypto_ft_tab[2];
>> -       const u32 *t3 = crypto_ft_tab[3];
>>         u32 d0, d1, d2, d3;
>>
>> -       d0 = t0[s[ 0]] ^ t1[s[ 5]] ^ t2[s[10]] ^ t3[s[15]] ^ k[0];
>> -       d1 = t0[s[ 4]] ^ t1[s[ 9]] ^ t2[s[14]] ^ t3[s[ 3]] ^ k[1];
>> -       d2 = t0[s[ 8]] ^ t1[s[13]] ^ t2[s[ 2]] ^ t3[s[ 7]] ^ k[2];
>> -       d3 = t0[s[12]] ^ t1[s[ 1]] ^ t2[s[ 6]] ^ t3[s[11]] ^ k[3];
>> +       d0 = t0[s[ 0]] ^ rol32(t0[s[ 5]], 8) ^ rol32(t0[s[10]], 16) ^ rol32(t0[s[15]], 24);
>> +       d1 = t0[s[ 4]] ^ rol32(t0[s[ 9]], 8) ^ rol32(t0[s[14]], 16) ^ rol32(t0[s[ 3]], 24);
>> +       d2 = t0[s[ 8]] ^ rol32(t0[s[13]], 8) ^ rol32(t0[s[ 2]], 16) ^ rol32(t0[s[ 7]], 24);
>> +       d3 = t0[s[12]] ^ rol32(t0[s[ 1]], 8) ^ rol32(t0[s[ 6]], 16) ^ rol32(t0[s[11]], 24);
>>
>> -       d[0] = d0;
>> -       d[1] = d1;
>> -       d[2] = d2;
>> -       d[3] = d3;
>> +       d[0] = cpu_to_le32(d0 ^ le32_to_cpu(k[0]));
>> +       d[1] = cpu_to_le32(d1 ^ le32_to_cpu(k[1]));
>> +       d[2] = cpu_to_le32(d2 ^ le32_to_cpu(k[2]));
>> +       d[3] = cpu_to_le32(d3 ^ le32_to_cpu(k[3]));
>>  }
>>
>>  #endif /* _CRYPTO_AEGIS_H */
>> --
>> 2.19.0
>>
>
> Thanks,
>
> --
> Ondrej Mosnacek <omosnace at redhat dot com>
> Associate Software Engineer, Security Technologies
> Red Hat, Inc.