On Thu, 18 Jun 2020 at 11:16, Guido Vranken <guidovranken@xxxxxxxxx> wrote:
>
> I think this could be an issue with the system's /dev/urandom or
> entropy, as I've observed similar infinite loops in BN_prime when I
> changed OpenSSL code to always return the same sequence of bytes from
> its PRNG (for testing purposes). It could also be a genuine bug in
> OpenSSL, or both. I'll let others comment on that.
>

The HW device that should generate entropy is enabled in the kernel:

~ # zcat /proc/config.gz | grep RANDOM_
CONFIG_HW_RANDOM_TIMERIOMEM=y
CONFIG_HW_RANDOM_OCTEON=y

and the daemon that feeds the entropy pool is also running:

~ # ps | grep rngd
 3193 root     /usr/sbin/rngd

Running the test on /dev/random also works well:

~ # time dd if=/dev/random of=./out3 bs=1024 count=1 iflag=fullblock
1+0 records in
1+0 records out
real    0m 0.02s
user    0m 0.00s
sys     0m 0.00s

Note that without the daemon running the dd takes very long, so it
looks like the mechanism to generate entropy from the HW is working
well.

When I run strace on the dd command without rngd running I see:

~ # strace -t dd if=/dev/random of=./out3 bs=1024 count=1 iflag=fullblock
...
12:49:29 openat(AT_FDCWD, "/dev/random", O_RDONLY|O_LARGEFILE) = 3
...
12:49:29 read(0, "-\335\265BA~Wl\253_\325&$\261\301\6\216\303\326\24q\331\233h\25\205\32(u\343@!"..., 1024) = 72
12:49:29 read(0, "\356\336\32\321\305\304", 952) = 6
12:49:30 read(0, "\233\330\20\240n\312", 946) = 6
12:49:31 read(0, "\25\215A\32\241\246", 940) = 6
12:49:31 read(0, "\350\272\352\350\354V", 934) = 6
12:49:31 read(0, "\274\334u\262\337V", 928) = 6
12:49:31 read(0, "N\243\200\16D>", 922) = 6
12:49:32 read(0, "\34F\333\n%i", 916) = 6
12:49:32 read(0, "\220\263\344\"\216\374", 910) = 6
12:49:32 read(0, "\27|\305\374V\272", 904) = 6
12:49:32 read(0, "\335\27\374\234\273\356", 898) = 6
12:49:32 read(0, "So\263\242|\207", 892) = 6
12:49:32 read(0, "\207\33\375\236mz", 886) = 6
12:49:34 read(0, "H\375\203v\344J", 880) = 6
12:49:35 read(0, "?o\3\326\334\2", 874) = 6
12:49:36 read(0, ";\22\312\314\237\312", 868) = 6

> On Thu, Jun 18, 2020 at 9:47 AM Ronny Meeus <ronny.meeus@xxxxxxxxx> wrote:
>>
>> Hello
>>
>> we are in the process of upgrading our openssl to version 1.1.1g.
>> On one of our architectures (Cavium MIPS, running kernel 4.9) we have
>> an issue in the ssh-keygen tool. It keeps on consuming 100% CPU of 1
>> core.
>> On other architectures we do not see the issue at all.
>>
>> I instrumented the openssl library with some traces and observed that
>> it keeps on looping in the "probable prime" function.
>> At the end of the function the "BN_num_bits" check is done and if the
>> return value is not equal to "bits" it basically starts all over
>> again.
>>
>>     }
>>     if (!BN_add_word(rnd, delta))
>>         return 0;
>>     if (BN_num_bits(rnd) != bits) {
>>         printf("%s BN_num_bits %d %d\n", __FUNCTION__, BN_num_bits(rnd), bits);
>>         goto again;
>>     }
>>     bn_check_top(rnd);
>>     return 1;
>> }
>>
>> I added the print function and the result of the print is as follows:
>> probable_prime BN_num_bits 1473 1536
>> This trace keeps on going forever and the values never change.
>>
>> Any idea what could be the underlying root-cause?
>>
>> Many thanks and best regards,
>> Ronny
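
To rule out the constant-PRNG scenario Guido describes (the random source returning the same byte sequence every time), a quick check along the following lines could be run on the affected board. This is only a sketch: the helper names `read_sample` and `source_varies` and the 32-byte sample size are made up for illustration and are not part of OpenSSL. It only detects a source that repeats itself exactly; it says nothing about entropy quality.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SAMPLE_LEN 32 /* arbitrary sample size, chosen for illustration */

/* Hypothetical helper: read SAMPLE_LEN bytes from `path` into `out`.
 * Returns 0 on success, -1 if the file cannot be opened or is too short. */
int read_sample(const char *path, unsigned char out[SAMPLE_LEN])
{
    FILE *f = fopen(path, "rb");
    size_t got;

    if (f == NULL)
        return -1;
    got = fread(out, 1, SAMPLE_LEN, f);
    fclose(f);
    return got == SAMPLE_LEN ? 0 : -1;
}

/* Hypothetical helper: open `path` twice and compare the two samples.
 * Returns 1 if the samples differ, 0 if they are byte-for-byte identical
 * (a source that keeps returning the same sequence), -1 on read error. */
int source_varies(const char *path)
{
    unsigned char a[SAMPLE_LEN], b[SAMPLE_LEN];

    assert(path != NULL);
    if (read_sample(path, a) != 0 || read_sample(path, b) != 0)
        return -1;
    return memcmp(a, b, SAMPLE_LEN) != 0;
}
```

Calling `source_varies("/dev/urandom")` would be expected to return 1 on a healthy system; a return value of 0 would point at the random source rather than at OpenSSL itself. /dev/urandom is used here rather than /dev/random so that the check does not block when rngd is not running.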