On Sat, Sep 14, 2019 at 8:02 AM Ahmed S. Darwish <darwish.07@xxxxxxxxx> wrote: > > On Thu, Sep 12, 2019 at 12:34:45PM +0100, Linus Torvalds wrote: > > > > An alternative might be to make getrandom() just return an error > > instead of waiting. Sure, fill the buffer with "as random as we can" > > stuff, but then return -EINVAL because you called us too early. > > ACK, that's probably _the_ most sensible approach. Only caveat is > the slight change in user-space API semantics though... > > For example, this breaks the just released systemd-random-seed(8) > as it _explicitly_ requests blocking behvior from getrandom() here: > Actually, I would argue that the "don't ever block, instead fill buffer and return error instead" fixes this broken case. > => src/random-seed/random-seed.c: > /* > * Let's make this whole job asynchronous, i.e. let's make > * ourselves a barrier for proper initialization of the > * random pool. > */ > k = getrandom(buf, buf_size, GRND_NONBLOCK); > if (k < 0 && errno == EAGAIN && synchronous) { > log_notice("Kernel entropy pool is not initialized yet, " > "waiting until it is."); > > k = getrandom(buf, buf_size, 0); /* retry synchronously */ > } Yeah, the above is yet another example of completely broken garbage. You can't just wait and block at boot. That is simply 100% unacceptable, and always has been, exactly because that may potentially mean waiting forever since you didn't do anything that actually is likely to add any entropy. > if (k < 0) { > log_debug_errno(errno, "Failed to read random data with " > "getrandom(), falling back to " > "/dev/urandom: %m"); At least it gets a log message. So I think the right thing to do is to just make getrandom() return -EINVAL, and refuse to block. As mentioned, this has already historically been a huge issue on embedded devices, and with disks turnign not just to NVMe but to actual polling nvdimm/xpoint/flash, the amount of true "entropy" randomness we can give at boot is very questionable. We can (and will) continue to do a best-effort thing (including very much using rdread and friends), but the whole "wait for entropy" simply *must* stop. > I've sent an RFC patch at [1]. > > [1] https://lkml.kernel.org/r/20190914122500.GA1425@darwi-home-pc Looks reasonable to me. Except I'd just make it simpler and make it a big WARN_ON_ONCE(), which is a lot harder to miss than pr_notice(). Make it clear that it is a *bug* if user space thinks it should wait at boot time. Also, we might even want to just fill the buffer and return 0 at that point, to make sure that even more broken user space doesn't then try to sleep manually and turn it into a "I'll wait myself" loop. Linus