Re: [PATCH] crypto: x86/aesni - implement accelerated CBCMAC, CMAC and XCBC shashes

On Mon, 3 Aug 2020 at 21:11, Ben Greear <greearb@xxxxxxxxxxxxxxx> wrote:
>
> Hello,
>
> This helps a bit...now download sw-crypt performance is about 150Mbps,
> but still not as good as with my patch on 5.4 kernel, and fpu is still
> high in perf top:
>
>     13.89%  libc-2.29.so   [.] __memset_sse2_unaligned_erms
>       6.62%  [kernel]       [k] kernel_fpu_begin
>       4.14%  [kernel]       [k] _aesni_enc1
>       2.06%  [kernel]       [k] __crypto_xor
>       1.95%  [kernel]       [k] copy_user_generic_string
>       1.93%  libjvm.so      [.] SpinPause
>       1.01%  [kernel]       [k] aesni_encrypt
>       0.98%  [kernel]       [k] crypto_ctr_crypt
>       0.93%  [kernel]       [k] udp_sendmsg
>       0.78%  [kernel]       [k] crypto_inc
>       0.74%  [kernel]       [k] __ip_append_data.isra.53
>       0.65%  [kernel]       [k] aesni_cbc_enc
>       0.64%  [kernel]       [k] __dev_queue_xmit
>       0.62%  [kernel]       [k] ipt_do_table
>       0.62%  [kernel]       [k] igb_xmit_frame_ring
>       0.59%  [kernel]       [k] ip_route_output_key_hash_rcu
>       0.57%  [kernel]       [k] memcpy
>       0.57%  libjvm.so      [.] InstanceKlass::oop_follow_contents
>       0.56%  [kernel]       [k] irq_fpu_usable
>       0.56%  [kernel]       [k] mac_do_update
>
> If you'd like help setting up a test rig and have an ath10k pcie NIC or ath9k pcie NIC,
> then I can help.  Possibly hwsim would also be a good test case, but I have not tried
> that.
>

I don't think this is likely to be reproducible on other
micro-architectures, so setting up a test rig is unlikely to help.

I'll send out a v2 which implements an ahash instead of a shash (and
implements some other tweaks) so that kernel_fpu_begin() is only
called twice for each packet on the cbcmac path.

Do you have any numbers for the old kernel without your patch? This
pathological FPU preserve/restore behavior could be caused by the
optimizations, or by other changes that landed in the meantime, so I
would like to know if kernel_fpu_begin() is as prominent in those
traces as well.


