On 7/29/20 1:06 PM, Ard Biesheuvel wrote:
On Wed, 29 Jul 2020 at 22:29, Ben Greear <greearb@xxxxxxxxxxxxxxx> wrote:
On 7/29/20 12:09 PM, Ard Biesheuvel wrote:
On Wed, 29 Jul 2020 at 15:27, Ben Greear <greearb@xxxxxxxxxxxxxxx> wrote:
On 7/28/20 11:06 PM, Ard Biesheuvel wrote:
On Wed, 29 Jul 2020 at 01:03, Ben Greear <greearb@xxxxxxxxxxxxxxx> wrote:
Hello,
As part of my wifi test tool, I need to do decrypt AES on the CPU, and the only way this
performs well is to use aesni. I've been using a patch for years that does this, but
recently somewhere between 5.4 and 5.7, the API I've been using has been removed.
Would anyone be interested in getting this support upstream? I'd be happy to pay for
the effort.
Here is the patch in question:
https://github.com/greearb/linux-ct-5.7/blob/master/wip/0001-crypto-aesni-add-ccm-aes-algorithm-implementation.patch
Please keep me in CC, I'm not subscribed to this list.
Hi Ben,
Recently, the x86 FPU handling was improved to remove the overhead of
preserving/restoring of the register state, so the issue that this
patch fixes may no longer exist. Did you try?
In any case, according to the commit log on that patch, the problem is
in the MAC generation, so it might be better to add a cbcmac(aes)
implementation only, and not duplicate all the CCM boilerplate.
Hello,
I don't know all of the details, and do not understand the crypto subsystem,
but I am pretty sure that I need at least some of this patch.
Whether this is true is what I am trying to get clarified.
Your patch works around a performance bottleneck related to the use of
AES-NI instructions in the kernel, which has been addressed recently.
If the issue still exists, we can attempt to devise a fix for it,
which may or may not be based on this patch.
Ok, I can do the testing. Do you expect 5.7-stable has all the needed
performance improvements?
Yes.
It does not, as far as we can tell.
We did a download test on an apu2 (small embedded AMD CPU, but with
aesni support). A WiFi station is in software-decrypt mode (ath10k-ct driver/firmware,
but ath9k would be valid to reproduce the issue as well.)
On our 5.4 kernel with the aesni patch applied, we get
about 220Mbps wpa2 download throughput. With open, we get about 260Mbps
download throughput.
On 5.7, without any aesni patch, we see about 116Mbps download wpa2 throughput,
and about 265Mbps open download throughput.
perf-top on 5.4 during download test with our aesni patch looks like this:
11.73% libc-2.29.so [.] __memset_sse2_unaligned_erms
4.79% [kernel] [k] _aesni_enc1
1.71% [kernel] [k] ___bpf_prog_run
1.66% [kernel] [k] memcpy
1.25% [kernel] [k] copy_user_generic_string
1.18% libjvm.so [.] InstanceKlass::oop_follow_contents
1.07% [kernel] [k] _aesni_enc4
0.98% [kernel] [k] csum_partial_copy_generic
0.96% libjvm.so [.] SpinPause
0.84% [kernel] [k] get_data_to_compute
0.81% libjvm.so [.] ParMarkBitMap::mark_obj
0.64% [kernel] [k] udp_sendmsg
0.62% [kernel] [k] __ip_append_data.isra.53
0.58% [kernel] [k] ipt_do_table
0.56% [kernel] [k] _aesni_inc
0.56% [kernel] [k] fib_table_lookup
0.55% [kernel] [k] __rcu_read_unlock
0.52% libc-2.29.so [.] __GI___strcmp_ssse3
0.50% [kernel] [k] igb_xmit_frame_ring
on 5.7, we see this:
11.36% libc-2.29.so [.] __memset_sse2_unaligned_erms
9.03% [kernel] [k] kernel_fpu_begin
4.75% libjvm.so [.] SpinPause
2.89% [kernel] [k] __crypto_xor
2.35% [kernel] [k] _aesni_enc1
1.94% [kernel] [k] copy_user_generic_string
1.29% [kernel] [k] aesni_encrypt
0.85% [kernel] [k] udp_sendmsg
0.85% [kernel] [k] crypto_cipher_encrypt_one
0.71% [kernel] [k] crypto_cbcmac_digest_update
0.69% [kernel] [k] __ip_append_data.isra.53
0.69% [kernel] [k] memcpy
0.68% [kernel] [k] crypto_ctr_crypt
0.61% [kernel] [k] irq_fpu_usable
0.58% [kernel] [k] ipt_do_table
0.55% [kernel] [k] __dev_queue_xmit
0.54% [kernel] [k] crypto_inc
0.49% libc-2.29.so [.] __GI___strcmp_ssse3
0.45% libjvm.so [.] InstanceKlass::oop_follow_contents
0.45% [kernel] [k] ip_route_output_key_hash_rcu
So, I think there is still some good improvement possible, likely with something like
the aesni patch I showed, but re-worked to function in 5.7+ kernels.
Thanks,
Ben
--
Ben Greear <greearb@xxxxxxxxxxxxxxx>
Candela Technologies Inc http://www.candelatech.com