On Mon, 2022-05-02 at 16:06 +0200, Jason A. Donenfeld wrote:
> In order to inform userspace of virtual machine forks, this commit adds
> a "fork_event" sysctl, which does not return any data, but allows
> userspace processes to poll() on it for notification of VM forks.
>
> It avoids exposing the actual vmgenid from the hypervisor to userspace,
> in case there is any randomness value in keeping it secret. Rather,
> userspace is expected to simply use getrandom() if it wants a fresh
> value.
>
> For example, the following snippet can be used to print a message every
> time a VM forks, after the RNG has been reseeded:
>
>   struct pollfd fd = { .fd = open("/proc/sys/kernel/random/fork_event", O_RDONLY) };
>   assert(fd.fd >= 0);
>   for (;;) {
>     read(fd.fd, NULL, 0);
>     assert(poll(&fd, 1, -1) > 0);
>     puts("vm fork detected");
>   }
>
> Various programs and libraries that utilize cryptographic operations
> depending on fresh randomness can invalidate old keys or take other
> appropriate actions when receiving that event. While this is racier than
> allowing userspace to mmap/vDSO the vmgenid itself, it's an incremental
> step forward that's not as heavyweight.

At your request, teleporting here the answer I gave on a different
thread, reinforced by some thinking.

As a user space crypto library person I think the only reasonable
interface is something like a vDSO.

poll() interfaces are nice and all for system programs that have full
control of their event loop and do not have to react immediately to
this event; however, crypto libraries do not have the luxury of
controlling the main loop of the application.

Additionally, crypto libraries really need to ensure the value they
return from their PRNG is fine, which means they do not return a value
if the vmgenid has changed before they can reseed, or there could be
catastrophic duplication of "random" values used in IVs or ECDSA
signatures or ids/cookies or whatever.
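
To make that requirement a bit more concrete, here is a rough sketch in
C of what the library side could look like: read an epoch value, reseed
if it differs from the one seen at seeding time, generate, and check the
epoch again before handing any bytes back. Everything named here
(vm_epoch, read_vm_epoch, struct prng, prng_get_bytes) is hypothetical.
vm_epoch is an ordinary variable standing in for whatever a vDSO or
mapped page might expose, and the xorshift generator is only a
placeholder for a real DRBG, not something to use for cryptography.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/random.h>

/* HYPOTHETICAL: stands in for an epoch word the kernel might expose via
 * a vDSO or a read-only mapped page. No such interface is assumed to
 * exist; this is a plain variable so the sketch can be exercised. */
static _Atomic uint64_t vm_epoch;

static uint64_t read_vm_epoch(void)
{
    return atomic_load_explicit(&vm_epoch, memory_order_acquire);
}

struct prng {
    uint64_t state;      /* toy xorshift state, NOT cryptographically secure */
    uint64_t seed_epoch; /* epoch observed when the state was last seeded */
    int seeded;          /* 0 until the first successful reseed */
    unsigned reseeds;    /* reseed counter, kept only for illustration */
};

/* Pull a fresh seed from the kernel and remember the epoch it belongs to. */
static int prng_reseed(struct prng *p, uint64_t epoch)
{
    uint64_t seed;

    if (getrandom(&seed, sizeof(seed), 0) != (ssize_t)sizeof(seed))
        return -1;
    p->state = seed | 1; /* xorshift needs a nonzero state */
    p->seed_epoch = epoch;
    p->seeded = 1;
    p->reseeds++;
    return 0;
}

/* Placeholder generator step (xorshift64); a real library uses a DRBG. */
static uint64_t prng_next(struct prng *p)
{
    uint64_t x = p->state;

    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    p->state = x;
    return x;
}

/* Fill buf with len bytes; returns 0 on success, -1 if reseeding failed.
 * Output is only returned if the epoch is unchanged across generation. */
int prng_get_bytes(struct prng *p, unsigned char *buf, size_t len)
{
    assert(p != NULL && (buf != NULL || len == 0));

    for (;;) {
        uint64_t before = read_vm_epoch();

        /* Never generate from state seeded under a different epoch. */
        if (!p->seeded || p->seed_epoch != before) {
            if (prng_reseed(p, before) != 0)
                return -1;
        }

        for (size_t i = 0; i < len; i++)
            buf[i] = (unsigned char)(prng_next(p) >> 56);

        /* Barrier-style re-check: if the epoch moved while we were
         * generating, discard the bytes and go around again. */
        if (read_vm_epoch() == before)
            return 0;
    }
}
```

The common path costs two loads and no syscall, which is the property
that matters to a library. A fork can still land right after the final
check, so this narrows the window rather than closing it.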

For crypto libraries it is much simpler to poll for this information
(that is, to check a value on demand) than to use notifications of any
kind, given that libraries are generally not in full control of what the
process does.

This needs to be polled fast as well, because the whole point of
initializing a PRNG in the library is that asking /dev/urandom all the
time is too slow (due to context switches and syscall overhead), so
anything that would require a context switch in order to pull data from
the PRNG would not really fly.

A vDSO or similar would allow the library to pull in the vmgenid, or
whatever epoch value, before generating the random numbers and then
barrier-style check that the value is still unchanged before returning
the random data to the caller. This will reduce the race condition
(which simply cannot be completely avoided) to a very unlikely event.

HTH,
Simo.

-- 
Simo Sorce
RHEL Crypto Team
Red Hat, Inc