On Monday, 20 March 2023 16:14:39 CET, Michael Wojcik via openssl-users
wrote:
From: openssl-users <openssl-users-bounces@xxxxxxxxxxx> On
Behalf Of Blumenthal, Uri - 0553 - MITLL
Sent: Monday, 20 March, 2023 08:28
Naïve questions, driven by my current use of Apple Silicon
(includes AES, SHA1, SHA2, SHA3 extended instructions):
1. Does the current stable OpenSSL-3.1.0 include (assembly?)
code that takes advantage of these CPU instructions?
For 3.0.8, a quick look at Configurations/15-ios.conf shows it
uses the armv4 assembly configuration, and at
crypto/aes/asm/aes-armv4.pl suggests it's not using any
dedicated instruction. The comments at the top of the latter
list various performance improvements but nothing about using an
extended instruction if it's available.
Glancing at some search results it appears the dedicated
instruction can get down to 0.9 cycles/byte, whereas the OpenSSL
source states it reaches 21.5 cycles/byte, so using the
dedicated instruction would be a big performance gain -- if
those sources are comparing the same thing (one might be
including some portion of overhead excluded by the other), and
only when doing AES operations, of course. With TLS, for
example, I/O will typically dominate, so speeding up AES may not do
much for many applications.
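
To put rough numbers on that trade-off, here is a small sketch that applies Amdahl's law to the two cycles/byte figures quoted above. The share of run time spent in AES is purely an assumption for illustration, not a measurement, and the two figures may not be directly comparable, as noted.

```python
"""Back-of-the-envelope estimate: how much does faster AES help overall?"""


def raw_speedup(old_cycles_per_byte: float, new_cycles_per_byte: float) -> float:
    """Speedup of the AES primitive alone, from two cycles/byte figures."""
    if old_cycles_per_byte <= 0 or new_cycles_per_byte <= 0:
        raise ValueError("cycles/byte figures must be positive")
    return old_cycles_per_byte / new_cycles_per_byte


def overall_speedup(aes_fraction: float, aes_speedup: float) -> float:
    """Amdahl's law: only the AES share of the run time gets faster."""
    if not 0.0 <= aes_fraction <= 1.0:
        raise ValueError("aes_fraction must be between 0 and 1")
    if aes_speedup <= 0:
        raise ValueError("aes_speedup must be positive")
    return 1.0 / ((1.0 - aes_fraction) + aes_fraction / aes_speedup)


if __name__ == "__main__":
    # 21.5 and 0.9 cycles/byte are the figures quoted in the message.
    aes_only = raw_speedup(21.5, 0.9)
    print(f"AES-only speedup: about {aes_only:.1f}x")
    # The AES shares below are hypothetical workloads, not measurements.
    for fraction in (0.05, 0.10, 0.50, 0.90):
        total = overall_speedup(fraction, aes_only)
        print(f"AES share {fraction:.0%}: overall about {total:.2f}x")
```

When AES accounts for only a small share of the total time, the overall gain stays close to 1x however fast the primitive becomes, which is the point about I/O-bound TLS.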
Now that said, there seem to be crypto/*/asm/*-armv8.pl files,
but 1) they don't seem to be used by any configurations, and 2)
the AES one (vpaes-armv8.pl) is a vectorized AES but doesn't
seem to use any dedicated instruction -- though I'm not at all
an ARM assembly programmer, so take that with a lot of salt.
2. How can I check whether an openssl installation (binary and
libraries) was compiled with Silicon optimizations (if I did not
compile it from source myself)?
If I wanted to do this, I'd probably disassemble libcrypto on
the target platform and search for the symbol AES_encrypt, and
then look at the implementation, or just search the disassembly
for the instruction in question with a suitable regex search.
There might be an easier way.
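
A minimal sketch of that disassemble-and-search approach, assuming an objdump-compatible disassembler is on the PATH and that you pass the libcrypto path yourself. The mnemonic lists are the AArch64 crypto-extension instructions as I understand them, so treat them as a starting point. A hit only shows the code is present in the binary, not that it is selected at run time.

```python
import re
import subprocess
import sys
from collections import Counter

# AArch64 crypto-extension mnemonics, grouped by feature (heuristic list).
CRYPTO_MNEMONICS = {
    "AES": ("aese", "aesd", "aesmc", "aesimc"),
    "SHA1": ("sha1c", "sha1m", "sha1p", "sha1h", "sha1su0", "sha1su1"),
    "SHA256": ("sha256h", "sha256h2", "sha256su0", "sha256su1"),
    "SHA512": ("sha512h", "sha512h2", "sha512su0", "sha512su1"),
    "SHA3": ("eor3", "rax1", "xar", "bcax"),
}


def count_crypto_instructions(disassembly: str) -> Counter:
    """Count disassembly lines that mention a mnemonic from each group."""
    counts = Counter()
    lines = disassembly.splitlines()
    for group, mnemonics in CRYPTO_MNEMONICS.items():
        pattern = re.compile(r"\b(?:" + "|".join(mnemonics) + r")\b", re.IGNORECASE)
        counts[group] = sum(1 for line in lines if pattern.search(line))
    return counts


def disassemble(library_path: str, tool: str = "objdump") -> str:
    """Run `<tool> -d <library_path>` and return its text output."""
    result = subprocess.run(
        [tool, "-d", library_path], capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__" and len(sys.argv) > 1:
    # The library path is supplied by the caller, e.g. the installed libcrypto.
    for group, n in count_crypto_instructions(disassemble(sys.argv[1])).items():
        print(f"{group}: {n} matching line(s)")
```

A zero count for a group is fairly strong evidence that the build lacks that acceleration. A non-zero count still needs a run-time check.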
Actually, you may want to look into OPENSSL_armcap and how it's used instead.
OpenSSL doesn't just accelerate the underlying instructions; it accelerates
whole cipher modes or even combined encryption+MAC operations.
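
As a rough illustration of the OPENSSL_armcap idea: libcrypto keeps a capability bitmask, and the OPENSSL_armcap environment variable can override it. The sketch below decodes such a mask and builds an environment for a child process. The bit assignments are my recollection of crypto/arm_arch.h and are an assumption to verify against your source tree. Newer releases define additional bits that are not listed here.

```python
import os

# Assumed bit layout (check crypto/arm_arch.h in your OpenSSL version).
ARMCAP_BITS = {
    "ARMV7_NEON": 1 << 0,
    "ARMV7_TICK": 1 << 1,
    "ARMV8_AES": 1 << 2,
    "ARMV8_SHA1": 1 << 3,
    "ARMV8_SHA256": 1 << 4,
    "ARMV8_PMULL": 1 << 5,
    "ARMV8_SHA512": 1 << 6,
}


def decode_armcap(mask: int) -> list[str]:
    """Return the names of the known capability bits set in mask."""
    return [name for name, bit in ARMCAP_BITS.items() if mask & bit]


def env_with_armcap(mask: int) -> dict[str, str]:
    """Copy the current environment and set OPENSSL_armcap to mask."""
    env = dict(os.environ)
    env["OPENSSL_armcap"] = hex(mask)
    return env


if __name__ == "__main__":
    # Example mask: NEON and SHA-256 on, AES off (illustrative only).
    mask = ARMCAP_BITS["ARMV7_NEON"] | ARMCAP_BITS["ARMV8_SHA256"]
    print(decode_armcap(mask))
    print(env_with_armcap(mask)["OPENSSL_armcap"])
```

One way to use this is to run `openssl speed -evp aes-128-gcm` once normally and once with the environment from `env_with_armcap()` and the AES bit cleared. A large drop in throughput would suggest the accelerated path is normally selected. Whether a given build honours the override is something to confirm rather than assume.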
3. What's the current analog of rdrand engine? I.e., does
OpenSSL take input from RDRAND and its analog on AARCH64,
and how can I check that it does?
RDRAND, yes, if OpenSSL was not built with no-rdrand. I don't
know what the analog might be on ARM.
Hopefully someone can provide more detailed and authoritative answers...
--
Regards,
Hubert Kario
Principal Quality Engineer, RHEL Crypto team
Web: www.cz.redhat.com
Red Hat Czech s.r.o., Purkyňova 99/71, 612 45, Brno, Czech Republic