On Tue, Jan 28, 2020 at 06:24:06PM +0000, Dan Heinz wrote:

> >RSA is not intended for bulk data decryption, its intended uses are
> >key transport and signing. Bulk data decryption is done via AES or
> >similar.

It sounds like you're directly encrypting data with RSA. That's a
mistake. RSA is for decrypting a symmetric algorithm key, which then
decrypts the data.

> >Are you sure that's seconds and not milliseconds? These are absurdly
> >long times, almost certainly dominated by factors other than the
> >encryption algorithms. On my 2015 laptop (MacOS) I get:
>
> Yes, it is seconds.

Sorry, 0.6 seconds for a single 1024-bit RSA_private_decrypt() (128
bytes of data) is not plausible, but you say you have just over 8KB of
data, which would take ~65 calls to RSA_private_decrypt() to decrypt
piecewise. It sure looks like you're measuring something other than
what you claim to be measuring, or not describing it accurately.

    OpenSSL 1.1.1c-dev  xx XXX xxxx
    options:bn(64,64) rc4(16x,int) des(int) aes(partial) idea(int) blowfish(ptr)
    compiler: cc -fPIC -arch x86_64 -g -O0 -Wall -DL_ENDIAN -DOPENSSL_PIC
        -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT
        -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM
        -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAES_ASM
        -DVPAES_ASM -DBSAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM
        -DPOLY1305_ASM -D_REENTRANT
                      sign    verify    sign/s verify/s
    rsa 1024 bits 0.000135s 0.000013s   7414.8  78566.9

On my laptop RSA_private_decrypt (aka sign) takes 135 microseconds. You
claim 600 milliseconds for perhaps ~60 calls, which might be 10ms each,
but that is still about two orders of magnitude too slow.

So, sorry, whatever you're measuring, it is not the performance of
RSA_private_decrypt().

> While I'm ok with the execution speed with OpenSSL 1.0.2, I'd like to
> figure out why the times doubled with OpenSSL 1.1.1.

Neither is a reasonable performance level, but it is also not
reasonable to use RSA for bulk data encryption.
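
To make the arithmetic above concrete, here is a rough sketch in
Python. The 8200-byte input is only an assumption standing in for "just
over 8KB", and the function names are invented for illustration; the
135 microsecond figure is the sign time from the speed output above.

```python
import math

RSA_BLOCK_BYTES = 128      # one 1024-bit RSA ciphertext block
SIGN_SECONDS = 0.000135    # per private-key operation, from the speed output


def rsa_calls(data_bytes, block_bytes=RSA_BLOCK_BYTES):
    """Number of private-key operations needed to process the data piecewise."""
    return math.ceil(data_bytes / block_bytes)


def expected_seconds(data_bytes, per_call=SIGN_SECONDS):
    """Total time if RSA private-key operations were the only cost."""
    return rsa_calls(data_bytes) * per_call


def slowdown(observed_seconds, data_bytes):
    """How many times slower the observed run is than the RSA-only estimate."""
    return observed_seconds / expected_seconds(data_bytes)


if __name__ == "__main__":
    data_bytes = 8200  # assumed size for "just over 8KB"
    print("calls:", rsa_calls(data_bytes))
    print("expected ms:", expected_seconds(data_bytes) * 1000)
    print("slowdown vs 0.6s observed:", slowdown(0.6, data_bytes))
```

Under these assumptions the RSA work alone should finish in single-digit
milliseconds, so a reported 0.6 seconds points at something other than
the private-key operations themselves.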

> I'm logging times before and after the calls to RSA_private_decrypt.

How many calls? What else is happening to feed the data into the
decryption algorithm, and reassemble the output?

> With OpenSSL 1.0.2 it takes on average about 4-8 milliseconds for each
> RSA_private_decrypt call. With OpenSSL 1.1.1d, it takes 10-15
> milliseconds for each RSA_private_decrypt call.

Now we see that you're in fact chunking data for multiple calls to
"decrypt" via RSA. That's a fatal design flaw. This is not a valid
operating mode for RSA. You MUST NOT do this.

> >> I'm wondering if perhaps my build configuration is incorrect or
> >> missing something for the 1.1.1d build. Here are the configuration
> >> parameters for the 64-bit build:

You have a deeper problem: your use of RSA is broken.

> The data being decrypted is local on the client machine and is just an XML file.
> RSA key is 1024 bits.
> I'm using OAEP padding.

This is a mistake; for asymmetric encryption you should be using CMS.

> Thank you for the information. I removed it from the configuration
> parameters. I didn't really notice a difference in execution time
> though. I also removed the no-asm parameter, setup nasm, and rebuilt
> with no noticeable changes.

Likely the time is dominated by something other than the RSA
operations, but since those are a mistake anyway, it hardly matters.

> > I logged things granular enough to see the speed difference was in
> > RSA_private_decrypt, but I'm not sure why it is so much slower with
> > 1.1.1d. Any help or ideas would be appreciated!

STOP. Fix your design to use CMS. Report any performance differences in
CMS between 1.0.2 and 1.1.1 when built correctly with asm support.

> >At 600ms for 8KB, it is not plausible that the time is spend doing
> >cryptography. That's barely fast enough to feed a 1980's modem.
>
> I would expect the execution times to be more in line with what I saw
> with Linux for both 1.0.2 and 1.1.1.
> But even so, I do not understand
> why just upgrading to 1.1.1 causes the RSA_private_decrypt calls to
> double in execution time from what they were with 1.0.2?

I would expect execution times that are 2 to 3 orders of magnitude
faster, especially if you were using sound cryptographic primitives.

-- 
    Viktor.