On Thu, 26 Sep 2019 at 16:55, Pascal Van Leeuwen <pvanleeuwen@xxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> I'm currently doing some performance benchmarking on a quad core Cortex
> A72 (Macchiatobin dev board) for rfc7539esp (ChachaPoly) and the
> relatively low performance kind of took me by surprise, considering how
> everyone keeps shouting how efficient Chacha-Poly is in software on
> modern CPUs.
>
> Then I noticed that it was using chacha20-generic for the encrypt
> direction, while a chacha20-neon implementation exists (it actually
> DOES use that one for decryption). Why would that be?
>
> Also, it uses poly1305-generic in both cases. Is that the best
> possible on ARM64? I did a quick search in the codebase but couldn't
> find any ARM64 optimized version ...
>

The Poly1305 implementation is part of the 18-piece WireGuard series I
just sent out yesterday (which I know you have seen :-))

The NEON ChaCha20 code should be used in preference to the generic code,
so if you end up with the wrong version, there's a bug somewhere we need
to fix.

Also, how do you know which direction uses which transform? What are
the refcounts for the transforms in /proc/crypto?
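
In case it helps with collecting those numbers, here is a minimal sketch of
one way to list the driver, priority and refcnt of the ChaCha20 and Poly1305
entries in /proc/crypto. It assumes the usual layout of "key : value" lines
with a blank line between entries; parse_proc_crypto and select_transforms
are hypothetical helper names made up for this example, not kernel or
library APIs.

```python
def parse_proc_crypto(text):
    """Split /proc/crypto-style text into one dict per registered algorithm."""
    entries = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current entry.
            if current:
                entries.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        entries.append(current)
    return entries


def select_transforms(entries, needles=("chacha20", "poly1305")):
    """Keep entries whose algorithm name mentions any of the given substrings."""
    return [e for e in entries if any(n in e.get("name", "") for n in needles)]


if __name__ == "__main__":
    try:
        with open("/proc/crypto") as f:
            entries = parse_proc_crypto(f.read())
    except OSError as err:
        # Not a Linux system, or /proc is not mounted.
        print(f"cannot read /proc/crypto: {err}")
    else:
        for e in select_transforms(entries):
            print(e.get("name"), e.get("driver"),
                  "priority=" + e.get("priority", "?"),
                  "refcnt=" + e.get("refcnt", "?"))
```

Comparing the refcnt values before and after bringing up the tunnel should
show which drivers actually get instantiated, and the priority values show
which one the crypto API is expected to prefer.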