Hi,

I'm currently doing some performance benchmarking on a quad-core Cortex-A72 (Macchiatobin dev board) for rfc7539esp (ChachaPoly), and the relatively low performance took me by surprise, considering how everyone keeps shouting about how efficient Chacha-Poly is in software on modern CPUs.

Then I noticed that it was using chacha20-generic for the encrypt direction, even though a chacha20-neon implementation exists (it actually DOES use that one for decryption). Why would that be?

It also uses poly1305-generic in both directions. Is that the best possible on ARM64? I did a quick search in the codebase but couldn't find any ARM64-optimized version ...

Regards,
Pascal van Leeuwen
Silicon IP Architect, Multi-Protocol Engines @ Verimatrix
www.insidesecure.com