Re: [PATCH] crypto: x86/aria-avx - fix using avx2 instructions

On 2/11/23 06:18, Samuel Neves wrote:

Hi Samuel,
Thank you so much for the review!

> On Fri, Feb 10, 2023 at 6:18 PM Taehee Yoo <ap420073@xxxxxxxxx> wrote:
>>
>> Also, vpbroadcastd is simply replaced by vmovdqa in it.
>>
>>   #ifdef CONFIG_AS_GFNI
>>   #define aria_sbox_8way_gfni(x0, x1, x2, x3,            \
>>                              x4, x5, x6, x7,             \
>>                              t0, t1, t2, t3,             \
>>                              t4, t5, t6, t7)             \
>> -       vpbroadcastq .Ltf_s2_bitmatrix, t0;             \
>> -       vpbroadcastq .Ltf_inv_bitmatrix, t1;            \
>> -       vpbroadcastq .Ltf_id_bitmatrix, t2;             \
>> -       vpbroadcastq .Ltf_aff_bitmatrix, t3;            \
>> -       vpbroadcastq .Ltf_x2_bitmatrix, t4;             \
>> +       vmovdqa .Ltf_s2_bitmatrix, t0;                  \
>> +       vmovdqa .Ltf_inv_bitmatrix, t1;                 \
>> +       vmovdqa .Ltf_id_bitmatrix, t2;                  \
>> +       vmovdqa .Ltf_aff_bitmatrix, t3;                 \
>> +       vmovdqa .Ltf_x2_bitmatrix, t4;                  \
>
> You can use vmovddup to replicate the behavior of vpbroadcastq for xmm
> registers. It's as fast as a movdqa and does not require increasing
> the data fields to be 16 bytes.

Thanks for this suggestion!
I tested this driver using vmovddup instead of vpbroadcastq, and it works well.
As you mentioned, vmovddup doesn't require 16-byte data.
So I will use the vmovddup instruction instead of vpbroadcastq.
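
To illustrate why the 8-byte constants are enough, here is a minimal portable C model of the load semantics. It is only a sketch and not the kernel code: the type xmm_model, the helper names, and any sample constant passed to them are hypothetical, and the alignment requirement of vmovdqa is not modeled.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 128-bit xmm register as two 64-bit lanes. */
typedef struct { uint64_t lane[2]; } xmm_model;

/*
 * Models "vmovddup m64, xmm" (and "vpbroadcastq m64, xmm"):
 * read 8 bytes from memory and copy them into both lanes.
 */
static xmm_model load_dup64(const void *src)
{
	xmm_model r;

	memcpy(&r.lane[0], src, 8);
	r.lane[1] = r.lane[0];
	return r;
}

/*
 * Models "vmovdqa m128, xmm": read all 16 bytes from memory, so the
 * constant has to be stored twice in the data section.
 */
static xmm_model load_full128(const void *src)
{
	xmm_model r;

	memcpy(r.lane, src, 16);
	return r;
}

/* Returns 1 when both modeled registers hold the same 128 bits. */
static int same_register(xmm_model a, xmm_model b)
{
	return a.lane[0] == b.lane[0] && a.lane[1] == b.lane[1];
}
```

In this model, load_dup64() on an 8-byte constant gives the same register contents as load_full128() on that constant stored twice, which is why the data fields can stay at 8 bytes.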

I will send the v2 patch for it.

Thank you so much,
Taehee Yoo


