16.09.2019 22:21, Theodore Y. Ts'o wrote:
> On Mon, Sep 16, 2019 at 09:17:10AM -0700, Linus Torvalds wrote:
>> So the semantics that getrandom() should have had are:
>>
>> getrandom(0) - just give me reasonable random numbers for any of a million non-strict-long-term-security use (ie the old urandom)
>> - the nonblocking flag makes no sense here and would be a no-op
>
> That change is what I consider highly problematic. There are a *huge* number of applications which use cryptography which assumes that getrandom(0) means, "I'm guaranteed to get something safe for cryptographic use". Changing this now would expose a very large number of applications to be insecure. Part of the problem here is that there are many different actors. There is the application or cryptographic library developer, who may want to be sure they have cryptographically secure random numbers. They are the ones who will select getrandom(0).
>
> Then you have the distribution or consumer-grade electronics developers who may choose to run them too early in some init script or systemd unit files. And some of these people may do something stupid, like run things too early, or omit a hardware random number generator in their design, even though it's for a security critical purpose (say, a digital wallet for bitcoin). Because some of these people might do something stupid, one argument (not mine) is that we must therefore not let getrandom() block. But doing this penalizes the security of all the users of the application, not just the stupid ones.
On Linux, there is no such thing as "too early", that's the problem. First, we already had one lesson about this, regarding applications that require libraries from /usr. There, it was due to various programs that run from udev rules, and dynamic/unpredictable dependencies. See https://freedesktop.org/wiki/Software/systemd/separate-usr-is-broken/; almost all arguments from there apply 1:1 here.
Second, people/distributions put unexpected stuff into their initramfs images, and we cannot say that they have no right to do so. E.g., on my system that's "cryptsetup" that unlocks the root partition, but manages to read a few bytes of uninitialized urandom before that. A warning here is almost unavoidable, and thus will be treated as SPAM.
No such considerations apply to OpenBSD (initramfs does not exist, and there is no equivalent of udev that reacts to cold-plug events by running programs), that's why the getentropy() design works there.
If we were to fix it, we should focus on making true entropy available unconditionally, even before /init in the initramfs starts, and warn not on the first access to urandom, but on the exec of /init. Look - distributions are already running "haveged", which harvests entropy from clock jitter. And they still manage to do it wrong (regardless of whether the "haveged" idea is wrong by itself), by running it too late (at least I don't know of any stock initramfs with either it or rngd included). So it's too complex, and needs to be simplified.
The kernel already has jitterentropy-rng, it uses the same idea as "haveged", but, alas, it is exposed as a crypto rng algorithm, not a hwrng. And I think it is a bug: cryptoapi rng algorithms are for things that get a seed and generate random numbers by rehashing it over and over, while jitterentropy-rng requires no seed. Would a patch be accepted to convert it to hwrng? (this is essentially the reverse of what commit c46ea13 did for exynos-rng)
>> getrandom(GRND_RANDOM) - get me actual _secure_ random numbers with blocking until entropy pool fills (but not the completely invalid entropy decrease accounting)
>> - the nonblocking flag is useful for bootup and for "I will actually try to generate entropy".
>>
>> and both of those are very very sensible actions. That would actually have _fixed_ the problems we had with /dev/[u]random, both from a performance standpoint and from a filesystem access standpoint. But that is sadly not what we have right now.
>>
>> And I suspect we can't fix it, since people have grown to depend on the old behavior, and already know to avoid GRND_RANDOM because it's useless with old kernels even if we fixed it with new ones.
>
> I don't think we can fix it, because it's the changing of getrandom(0)'s behavior which is the problem, not GRND_RANDOM. People *expect* getrandom(0) to always return secure results. I don't think we can make it sometimes return not-necessarily secure results depending on when the systems integrator or distribution decides to run the application, and depending on the hardware platform (yes, traditional x86 systems are probably fine, and fortunately x86 embedded CPU are too expensive and have lousy power management, so no one really uses x86 for embedded yet, despite Intel's best efforts). That would just be a purely irresponsible thing to do, IMO.
>
>> Does anybody really seriously debate the above? Ted? Are you seriously trying to claim that the existing GRND_RANDOM has any sensible use? Are you seriously trying to claim that the fact that we don't have a sane urandom source is a "feature"?
>
> There are people who can debate that GRND_RANDOM has any sensible use cases. GPG uses /dev/random, and that was a fully informed choice. I'm not convinced, because I think that at least for now the CRNG is perfectly fine for 99.999% of the use cases. Yes, in a post-quantum cryptography world, the CRNG might be screwed --- but so will most of the other cryptographic algorithms in the kernel. So if anyone ever gets post-quantum cryptanalytic attacks working, the use of the CRNG is going to be the least of our problems.
>
> As I mentioned to you in Lisbon, I've been going back and forth about whether or not to rip out the entire /dev/random infrastructure, mainly for code maintainability reasons. The only reason why I've been holding back is because there are (very few) non-insane people who do want to use it. There are also a much larger number of rational people who use it because they want some insane PCI compliance labs to go away. What I suspect most of them are actually doing in practice is they use /dev/random, but they also use a hardware random number generator so /dev/random never actually blocks in practice. The use of /dev/random is enough to make the PCI compliance lab go away, and the hardware random number generator (or virtio-rng on a VM) makes /dev/random useable.
Please don't forget about people who run Linux on Hyper-V, not on KVM, and thus have no access to virtio-rng ;)
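Whether a given machine (a Hyper-V guest, for instance) has any hardware RNG registered with the kernel can be checked from userspace. The following is only a rough sketch in Python: the helper names `read_text_or_none` and `rng_status` are made up for illustration, and it assumes the usual sysfs/procfs locations of the hw_random and random interfaces, reporting `None` wherever a file is missing or unreadable.

```python
from pathlib import Path


def read_text_or_none(path):
    """Return the stripped contents of a file, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def rng_status():
    """Collect what the kernel reports about hardware RNGs and the entropy pool."""
    return {
        # Hardware RNG currently feeding the kernel, if any is selected.
        "hwrng_current": read_text_or_none("/sys/class/misc/hw_random/rng_current"),
        # All hardware RNG drivers that have registered themselves.
        "hwrng_available": read_text_or_none("/sys/class/misc/hw_random/rng_available"),
        # The kernel's current entropy estimate, in bits.
        "entropy_avail": read_text_or_none("/proc/sys/kernel/random/entropy_avail"),
    }


if __name__ == "__main__":
    for key, value in rng_status().items():
        print(f"{key}: {value}")
```

An empty or missing `hwrng_available` would suggest that /dev/random on that machine has nothing but the kernel's own entropy collection to rely on.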
> But I don't think we can reuse GRND_RANDOM for that reason. We could create a new flag, GRND_INSECURE, which never blocks. And that allows us to solve the problem for silly applications that are using getrandom(2) for non-cryptographic use cases. Use cases might include Python dictionary seeds, gdm for MIT Magic Cookie, UUID generation where best efforts probably is good enough, etc. The answer today is they should just use /dev/urandom, since that exists today, and we have to support it for backwards compatibility anyway.
>
> It sounds like gdm recently switched to getrandom(2), and I suspect that it's going to get caught on some hardware configs anyway, even without the ext4 optimization patch. So I suspect gdm will switch back to /dev/urandom, and this particular pain point will probably go away.
>
> - Ted
Well, at this point, I see that there is a lot of disagreement about how getrandom() should behave, aggravated by the baggage of existing applications and libraries with contradictory requirements regarding getrandom(0) (so not really solvable). I am almost convinced that we might want to return -ENOSYS unconditionally, and create a different system call with sane flags.
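What this means for a caller with non-cryptographic needs can be sketched as follows. This is only an illustration in Python, whose `os.getrandom()` wraps the system call. The helper name `best_effort_bytes` is hypothetical, and the fallback policy (read /dev/urandom when the pool is not yet initialized, or when the call fails with ENOSYS) is an assumption about what such a consumer might want, not something proposed verbatim in this thread.

```python
import errno
import os


def best_effort_bytes(n):
    """Return (data, source) without waiting for the CRNG to be initialized.

    NOT suitable for key generation: early in boot the fallback path may
    return bytes from an unseeded pool.
    """
    getrandom = getattr(os, "getrandom", None)  # only present on Linux builds
    if getrandom is not None:
        try:
            # GRND_NONBLOCK: fail with EAGAIN instead of blocking
            # when the pool has not been initialized yet.
            return getrandom(n, os.GRND_NONBLOCK), "getrandom"
        except BlockingIOError:
            pass  # the "too early" case: fall through to /dev/urandom
        except OSError as exc:
            if exc.errno != errno.ENOSYS:
                raise
            # The kernel does not provide the syscall: fall through as well.
    # Reading /dev/urandom never blocks, initialized or not.
    with open("/dev/urandom", "rb", buffering=0) as f:
        return f.read(n), "urandom"


if __name__ == "__main__":
    data, source = best_effort_bytes(16)
    print(source, data.hex())
```

The second element of the returned tuple tells the caller which path was taken, so an application could at least log when it had to settle for the unguarded source.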
-- Alexander E. Patrakov