On Sat, Sep 14, 2019 at 06:10:47PM -0700, Linus Torvalds wrote:
> > We could return 0 for success, and yet "the best we
> > can do" could be really terrible.
>
> Yes. Which is why we should warn.

I'm all in favor of warning.  But people might just ignore the
warning.  We warn today about systemd trying to read from /dev/urandom
too early, and that just gets ignored.

> But we can't *block*. Because that just breaks people. Like shown in
> this whole discussion.

I'd be willing to let it take at least 2 minutes, since that's slow
enough to be annoying.  I'd be willing to kill the process which tried
to call getrandom too early.  But I believe blocking is better than
returning something potentially not random at all.  I think failing
"safe" is extremely important, and returning something not random
which then gets used for a long-term private key is a disaster.  You
basically want to turn getrandom into /dev/urandom, and that's how we
got into the mess where 10% of the publicly accessible ssh keys could
be guessed.  I've tried that already, and we saw how that ended.

> Why is warning different? Because hopefully it tells the only person
> who can *do* something about it - the original maintainer or developer
> of the user space tools - that they are doing something wrong and need
> to fix their broken model.

Except the developer could (and *has*) just ignore the warning, which
is what happened with /dev/urandom when it was accessed too early.
Even when I drew some developers' attention to the warning, at least
one just said, "meh", and blew me off.  Would making it noisier (e.g.,
a WARN_ON) make enough of a difference?  I guess I'm just not
convinced.

> Blocking doesn't do that. Blocking only makes the system unusable. And
> yes, some security people think "unusable == secure", but honestly,
> those security people shouldn't do system design. They are the worst
> kind of "technically correct" incompetent.

Which is worse really depends on your point of view, and on what the
system might be controlling.  If access to the system could allow a
malicious attacker to trigger a nuclear bomb, failing safe is always
going to be better.  In other cases, failing open is certainly more
convenient, and it leaves the system more "usable".  But how do we
trade off "usable" against "insecure"?  There are times when
"unusable" is WAY better than "could risk life or human safety".

Would you be willing to settle for a CONFIG option or a boot
command-line option which controls whether we fail "safe" or fail
"open" if someone calls getrandom(2) and there isn't enough entropy?
Then each distribution and/or system integrator can decide whether
"proper systems design" considers "usability" or "must not fail
insecurely" to be more important.

					- Ted
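
P.S.  To make concrete what failing "safe" can look like from
userspace today, here is a minimal sketch using the existing
GRND_NONBLOCK flag: the process refuses to cut a long-term key before
the CRNG is initialized, rather than consuming whatever the kernel can
scrape together.  The key-generation step is elided; this is an
illustration, not a patch.

	#include <sys/random.h>		/* getrandom(2), GRND_NONBLOCK */
	#include <errno.h>
	#include <stdio.h>
	#include <stdlib.h>

	int main(void)
	{
		unsigned char seed[32];

		/* With GRND_NONBLOCK, getrandom() fails with EAGAIN
		 * instead of blocking when the entropy pool has not
		 * yet been initialized.
		 */
		if (getrandom(seed, sizeof(seed), GRND_NONBLOCK) < 0) {
			if (errno == EAGAIN) {
				/* Fail safe: no entropy yet, so do
				 * not generate a long-term key.
				 */
				fprintf(stderr, "CRNG not ready; refusing to generate key\n");
				exit(1);
			}
			perror("getrandom");
			exit(1);
		}

		/* ... derive the long-term key from seed[] ... */
		return 0;
	}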
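
P.P.S.  Roughly the shape the boot-option version could take.  To be
clear, the "random.getrandom_fail_open" parameter name and the
getrandom_fail_open flag are hypothetical, not anything that exists in
drivers/char/random.c today; crng_ready() and wait_for_random_bytes()
are the existing helpers:

	#include <linux/init.h>
	#include <linux/kernel.h>

	/* Hypothetical knob: the default is to fail safe (block). */
	static bool getrandom_fail_open;

	static int __init getrandom_fail_open_setup(char *arg)
	{
		getrandom_fail_open = true;
		return 0;
	}
	early_param("random.getrandom_fail_open", getrandom_fail_open_setup);

	/* Then, in the getrandom(2) path, something like:
	 *
	 *	if (!crng_ready()) {
	 *		if (getrandom_fail_open)
	 *			pr_warn_once("getrandom: CRNG not ready, returning best-effort bytes\n");
	 *		else
	 *			return wait_for_random_bytes();  (fail safe: block)
	 *	}
	 */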