Re: [PATCH] crypto: x86/aesni - implement accelerated CBCMAC, CMAC and XCBC shashes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 23 Sep 2020 at 13:03, Ben Greear <greearb@xxxxxxxxxxxxxxx> wrote:
>
> On 8/4/20 12:45 PM, Ben Greear wrote:
> > On 8/4/20 6:08 AM, Ard Biesheuvel wrote:
> >> On Tue, 4 Aug 2020 at 15:01, Ben Greear <greearb@xxxxxxxxxxxxxxx> wrote:
> >>>
> >>> On 8/4/20 5:55 AM, Ard Biesheuvel wrote:
> >>>> On Mon, 3 Aug 2020 at 21:11, Ben Greear <greearb@xxxxxxxxxxxxxxx> wrote:
> >>>>>
> >>>>> Hello,
> >>>>>
> >>>>> This helps a bit...now download sw-crypt performance is about 150Mbps,
> >>>>> but still not as good as with my patch on 5.4 kernel, and fpu is still
> >>>>> high in perf top:
> >>>>>
> >>>>>       13.89%  libc-2.29.so   [.] __memset_sse2_unaligned_erms
> >>>>>         6.62%  [kernel]       [k] kernel_fpu_begin
> >>>>>         4.14%  [kernel]       [k] _aesni_enc1
> >>>>>         2.06%  [kernel]       [k] __crypto_xor
> >>>>>         1.95%  [kernel]       [k] copy_user_generic_string
> >>>>>         1.93%  libjvm.so      [.] SpinPause
> >>>>>         1.01%  [kernel]       [k] aesni_encrypt
> >>>>>         0.98%  [kernel]       [k] crypto_ctr_crypt
> >>>>>         0.93%  [kernel]       [k] udp_sendmsg
> >>>>>         0.78%  [kernel]       [k] crypto_inc
> >>>>>         0.74%  [kernel]       [k] __ip_append_data.isra.53
> >>>>>         0.65%  [kernel]       [k] aesni_cbc_enc
> >>>>>         0.64%  [kernel]       [k] __dev_queue_xmit
> >>>>>         0.62%  [kernel]       [k] ipt_do_table
> >>>>>         0.62%  [kernel]       [k] igb_xmit_frame_ring
> >>>>>         0.59%  [kernel]       [k] ip_route_output_key_hash_rcu
> >>>>>         0.57%  [kernel]       [k] memcpy
> >>>>>         0.57%  libjvm.so      [.] InstanceKlass::oop_follow_contents
> >>>>>         0.56%  [kernel]       [k] irq_fpu_usable
> >>>>>         0.56%  [kernel]       [k] mac_do_update
> >>>>>
> >>>>> If you'd like help setting up a test rig and have an ath10k pcie NIC or ath9k pcie NIC,
> >>>>> then I can help.  Possibly hwsim would also be a good test case, but I have not tried
> >>>>> that.
> >>>>>
> >>>>
> >>>> I don't think this is likely to be reproducible on other
> >>>> micro-architectures, so setting up a test rig is unlikely to help.
> >>>>
> >>>> I'll send out a v2 which implements a ahash instead of a shash (and
> >>>> implements some other tweaks) so that kernel_fpu_begin() is only
> >>>> called twice for each packet on the cbcmac path.
> >>>>
> >>>> Do you have any numbers for the old kernel without your patch? This
> >>>> pathological FPU preserve/restore behavior could be caused be the
> >>>> optimizations, or by other changes that landed in the meantime, so I
> >>>> would like to know if kernel_fpu_begin() is as prominent in those
> >>>> traces as well.
> >>>>
> >>>
> >>> This same patch makes i7 mobile processors able to handle 1Gbps+ software
> >>> decrypt rates, where without the patch, the rate was badly constrained and CPU
> >>> load was much higher, so it is definitely noticeable on other processors too.
> >>
> >> OK
> >>
> >>> The weak processor on the current test rig is convenient because the problem
> >>> is so noticeable even at slower wifi speeds.
> >>>
> >>> We can do some tests on 5.4 with our patch reverted.
> >>>
> >>
> >> The issue with your CCM patch is that it keeps the FPU enabled for the
> >> entire input, which also means that preemption is disabled, which
> >> makes the -rt people grumpy. (Of course, it also uses APIs that no
> >> longer exists, but that should be easy to fix)
> >>
> >> Do you happen to have any ballpark figures for the packet sizes and
> >> the time spent doing encryption?
> >>
> >
> > My tester reports this last patch appears to break wpa-2 entirely, so we
> > cannot test it as is.
>
> Hello,
>
> I'm still hoping that I can find some one to help enable this feature again
> on 5.9-ish kernels.
>
> I don't care about preemption being disabled for long-ish times, that is no worse than what I had
> before, so even if an official in-kernel patch is not possible, I can carry an
> out-of-tree patch.
>

Just a note that this is still on my stack, just not at the very top :-)



[Index of Archives]     [Kernel]     [Gnu Classpath]     [Gnu Crypto]     [DM Crypt]     [Netfilter]     [Bugtraq]

  Powered by Linux