Re: [PATCH] Improve endian conversion in umac.c

rapier <rapier@xxxxxxx> · Wed, 9 Mar 2022 15:02:31 -0500

On 3/8/22 6:12 PM, Darren Tucker wrote:
On Wed, 9 Mar 2022 at 09:59, rapier <rapier@xxxxxxx> wrote:
I was poking at the MAC routines looking for some efficiencies for high
performance environments. I was looking at the umac.c and comparing it
to the original source at https://fastcrypto.org/front/umac/umac.c. After
a couple of false starts I found that reverting the endian conversion
routines back to what Krovetz wrote yielded an 8% to 16% improvement.

Interesting!  One obvious difference is what you have is potentially
inline-able static functions instead of function calls across
compilation units that (barring whole program optimization) can't be
inlined.  If you put the existing functions from misc.c into umac.c as
statics do you see the same improvement?

That worked and I saw the same improvement. For a 20GB test (a dd pipe 
with aes256-ctr) I'm seeing peaks at 870MB/s versus 720MB/s for stock.
So it does look like it's being inlined. I'm going to poke at a
couple more things and then provide an updated patch. I think I have a 
big endian system around here somewhere so I want to test on that as well.
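
For anyone following along, here is a rough sketch of the kind of change
being discussed. It is not the actual patch: the helper names below
(load_u32_le, store_u32_be, sum_words_le) are made up for illustration and
are not the functions in misc.c or umac.c. The point is only that a small
byte-order helper defined as a static function in the same file as its
caller can be inlined by the compiler, whereas a call into another
compilation unit generally cannot be without whole program optimization.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helpers, for illustration only.
 * Being static and defined in the same file as their callers, the
 * compiler is free to inline them into a hot loop.
 */
static inline uint32_t
load_u32_le(const void *vp)
{
	const uint8_t *p = (const uint8_t *)vp;

	/* Assemble the value byte by byte; correct on any host endianness. */
	return (uint32_t)p[0] |
	    ((uint32_t)p[1] << 8) |
	    ((uint32_t)p[2] << 16) |
	    ((uint32_t)p[3] << 24);
}

static inline void
store_u32_be(void *vp, uint32_t v)
{
	uint8_t *p = (uint8_t *)vp;

	/* Most significant byte first. */
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

/*
 * Example caller with a tight loop: sums nwords little-endian 32-bit
 * words from buf (wrapping on overflow). With the helper visible in this
 * file, the load can be folded into the loop body rather than costing a
 * function call per word.
 */
static uint32_t
sum_words_le(const uint8_t *buf, size_t nwords)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i < nwords; i++)
		sum += load_u32_le(buf + 4 * i);
	return sum;
}
```

Whether the compiler actually inlines these depends on the optimization
level and flags, so the timing numbers above are the real evidence here;
this only shows the shape of the code.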

This is pleasing. Initially I was looking at improving performance by 
pipelining the MAC but that's not possible with ETM. This is about the 
level of performance gain I was hoping to get with that and it's a lot 
easier.

Anyway, I'll get the new patch up soon.

Chris
_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@xxxxxxxxxxx
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev