* Jason A. Donenfeld:

> +      pfd.fd = TEMP_FAILURE_RETRY (
> +          __open64_nocancel ("/dev/random", O_RDONLY | O_CLOEXEC | O_NOCTTY));
> +      if (pfd.fd < 0)
> +        arc4random_getrandom_failure ();
> +      if (__poll (&pfd, 1, -1) < 0)
> +        arc4random_getrandom_failure ();
> +      if (__close_nocancel (pfd.fd) < 0)
> +        arc4random_getrandom_failure ();

What happens if /dev/random is actually /dev/urandom?  Will the poll
call fail?

I think we need a no-cancel variant of poll here, and we also need to
handle EINTR gracefully.

Performance-wise, my 1000 element shuffle benchmark runs about 14 times
slower without userspace buffering.  (For comparison, just removing
ChaCha20 while keeping a 256-byte buffer makes it run roughly 25% slower
than current master.)

Our random() implementation is quite slow, so arc4random() as a
replacement call is competitive.  The unbuffered version, not so much.

Running the benchmark, I see 40% of the time spent in chacha_permute in
the kernel, that is really quite odd.  Why doesn't the system call
overhead dominate?

Thanks,
Florian