Re: OpenSSL 1.1.1g Windows build slow rsa tests

Hi Dan,

On 21/01/21 19:22, Dan Heinz wrote:
[...]

Thank you all for the helpful suggestions. When I removed no-asm and built with nmake in the Developer Command Prompt for Visual Studio 2015, I got the error "VC-WIN64A X86 conflicts with target x64". Running cl from that prompt printed "Microsoft (R) C/C++ Optimizing Compiler Version 19.00.24215.1 for x86", so I had been using the x86 compiler for an x64 target. I'm not sure why the build succeeded with no-asm, but it did.
Once I switched to the correct command prompt (the Visual Studio x64 Native Tools Command Prompt), I saw a huge speed increase. For example, for 2048-bit RSA:
Doing 2048 bits private rsa's for 10s: 8384 2048 bits private RSA's in 10.02s
Doing 2048 bits public rsa's for 10s: 236090 2048 bits public RSA's in 9.98s

Previously, I saw:
Doing 2048 bits private rsa's for 10s: 409 2048 bits private RSA's in 10.00s
Doing 2048 bits public rsa's for 10s: 15663 2048 bits public RSA's in 10.02s

For further testing, I added back no-asm and my speed tests were in line with the downloaded openssl binary I was testing with.
Doing 2048 bits private rsa's for 10s: 1868 2048 bits private RSA's in 10.00s
Doing 2048 bits public rsa's for 10s: 71338 2048 bits public RSA's in 10.02s

So removing no-asm gives a pretty large speedup as well.

In summary, using the correct build tools helps (although I am surprised it built with no-asm).  And removing no-asm sped things up.
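To put numbers on the comparison, here is a minimal Python sketch that turns the quoted "openssl speed" lines into operations per second and a ratio against the earlier build. The labels are only my reading of which build produced which figures, and the regular expression is written for the line format quoted above; neither comes from OpenSSL itself.

```python
import re

# Matches the result part of a quoted line, e.g.
# "... 8384 2048 bits private RSA's in 10.02s"
LINE_RE = re.compile(r"(\d+) (\d+) bits (private|public) RSA's in ([\d.]+)s")


def ops_per_second(line):
    """Return (kind, operations per second) for one quoted speed line."""
    m = LINE_RE.search(line)
    if m is None:
        raise ValueError("unrecognised speed line: " + line)
    count, _bits, kind, seconds = m.groups()
    return kind, int(count) / float(seconds)


# Labels are assumptions about the build behind each pair of lines.
BASELINE = "earlier build (x86 prompt, no-asm)"
RUNS = {
    BASELINE: (
        "Doing 2048 bits private rsa's for 10s: 409 2048 bits private RSA's in 10.00s",
        "Doing 2048 bits public rsa's for 10s: 15663 2048 bits public RSA's in 10.02s",
    ),
    "x64 prompt, no-asm": (
        "Doing 2048 bits private rsa's for 10s: 1868 2048 bits private RSA's in 10.00s",
        "Doing 2048 bits public rsa's for 10s: 71338 2048 bits public RSA's in 10.02s",
    ),
    "x64 prompt, asm": (
        "Doing 2048 bits private rsa's for 10s: 8384 2048 bits private RSA's in 10.02s",
        "Doing 2048 bits public rsa's for 10s: 236090 2048 bits public RSA's in 9.98s",
    ),
}

# Map each label to {"private": ops/s, "public": ops/s}.
RATES = {
    label: dict(ops_per_second(line) for line in lines)
    for label, lines in RUNS.items()
}

for label, rate in RATES.items():
    for kind in ("private", "public"):
        ratio = rate[kind] / RATES[BASELINE][kind]
        print(f"{label:36s} {kind:8s} {rate[kind]:10.1f} ops/s {ratio:6.1f}x")
```

For private-key operations this works out to roughly 20x over the earlier build, and roughly 4.5x for asm over no-asm with the same x64 tools.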

Not sure why you'd want to do a 'no-asm' build to begin with, but another thing worth testing with your "asm" build is to run the speed test like this:
 set OPENSSL_ia32cap=0
 openssl speed rsa
(Linux/UNIX:  OPENSSL_ia32cap=0 openssl speed rsa)

On my (10th gen Intel) laptop this gives me a ~35% performance hit. Explanation:

- no-asm build -> compiler generates all code, no hand-tuned assembly used at all; should be slowest

- asm build + OPENSSL_ia32cap=0 -> no newer CPU features used, but hand-tuned assembly is still used. AES encryption especially takes a hit if you disable these newer features

- asm build -> hand-tuned assembly, including the use of all new CPU features such as AES, SHA etc.

I've found that this sometimes helps manage expectations when the "build environment" CPU and the "runtime environment" CPU are very different. I've seen a developer claim their code runs blazingly fast on their Core i7, only to find that performance is terrible once it is deployed on a cheaper runtime device.

Note that with a no-asm build, setting OPENSSL_ia32cap=0 should make no difference compared to plain "no-asm".
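If you want to repeat the comparison on several machines, a small Python wrapper along these lines may help. It is only a sketch: it assumes an "openssl" binary is on the PATH, and the helper names (speed_env, run_speed, compare) are made up for illustration. The "rsa2048" argument restricts the run to 2048-bit RSA.

```python
import os
import subprocess


def speed_env(disable_cpu_features):
    """Copy the current environment, with or without OPENSSL_ia32cap=0."""
    env = dict(os.environ)
    if disable_cpu_features:
        env["OPENSSL_ia32cap"] = "0"
    else:
        # Make sure a value inherited from the shell does not skew the run.
        env.pop("OPENSSL_ia32cap", None)
    return env


def run_speed(disable_cpu_features, algorithm="rsa2048"):
    """Run `openssl speed <algorithm>`; output goes straight to the terminal."""
    subprocess.run(
        ["openssl", "speed", algorithm],
        env=speed_env(disable_cpu_features),
        check=True,  # raise if openssl exits with an error
    )


def compare(algorithm="rsa2048"):
    """Run the same benchmark twice: all CPU features, then OPENSSL_ia32cap=0."""
    for label, disable in (("all CPU features", False), ("OPENSSL_ia32cap=0", True)):
        print("==", label)
        run_speed(disable, algorithm)
```

Calling compare() runs the benchmark twice back to back, so the two result tables can be compared directly on the machine the code will actually be deployed on.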

JM2CW,

JJK / Jan Just Keijser
