Re: [PATCH 2/2] random: add fork_event sysctl for polling VM forks

Alexander Graf <graf@xxxxxxxxxx> · Wed, 11 May 2022 15:20:46 +0200

On 11.05.22 03:18, Jason A. Donenfeld wrote:
> Hi Simo,
>
> On Tue, May 10, 2022 at 08:40:48PM -0400, Simo Sorce wrote:
>> At your request teleporting here the answer I gave on a different
>> thread, reinforced by some thinking.
>>
>> As a user space crypto library person I think the only reasonable
>> interface is something like a vDSO.
>>
>> Poll() interfaces are nice and all for system programs that have full
>> control of their event loop and do not have to react immediately to
>> this event, however crypto libraries do not have the luxury of
>> controlling the main loop of the application.
>>
>> Additionally crypto libraries really need to ensure the value they
>> return from their PRNG is fine, which means they do not return a value
>> if the vmgenid has changed before they can reseed, or there could be
>> catastrophic duplication of "random" values used in IVs or ECDSA
>> Signatures or ids/cookies or whatever.
>>
>> For crypto libraries it is much simpler to poll for this information
>> than using notifications of any kind given libraries are
>> generally not in full control of what the process does.
>>
>> This needs to be polled fast as well, because the whole point of
>> initializing a PRNG in the library is that asking /dev/urandom all the
>> time is too slow (due to context switches and syscall overhead), so
>> anything that would require a context switch in order to pull data from
>> the PRNG would not really fly.
>>
>> A vDSO or similar would allow to pull the vmgenid or whatever epoch
>> value in before generating the random numbers and then barrier-style
>> check that the value is still unchanged before returning the random
>> data to the caller. This will reduce the race condition (which simply
>> cannot be completely avoided) to a very unlikely event.
>
> It sounds like your library issue is somewhat similar to what Alex was
> talking about with regards to having a hard time using poll in s2n. I'm
> still waiting to hear if Amazon figured out some way that this is
> possible (with, e.g., a thread). But anyway, it seems like this is

Sounds like I didn't reply with my findings - sorry about that. Our s2n
people *could* build something based on a thread, but they are afraid that
it would be racy and would force the library to create a thread, which it
does not do today.

So in a nutshell, possible yes, desirable no.

I think we're maybe a bit too scared of building something from scratch 
here. What would the best case situation be? Let's roll backwards from 
that then.

From what I read, we want a "VMGenID v2" device that gives us the 
ability to

  * Get an IRQ on VM clone
  * Store and update an RNG seed value (128 bits? configurable length?)
    at a physical address or standalone page on clone
  * Store and update a unique-to-this-VM rolling 32-bit identifier at a
    physical address or standalone page on clone

We can either make the device expose these values as individual pages 
(like VMGenID does today) or as guest physical addresses that it needs 
to store into (like virtio-rng). The latter makes protection from DMA 
attacks of the hypervisor and kexec slightly more complicated, but it 
would be doable.

VMGenID has 2 of the 3 features above. So why don't we just go the easy
route: add a second property to VMGenID that gives us another page holding
that 32-bit value, and then provide a /dev/vmgenid device node that you can
open and mmap() that page from?
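
To make the library side concrete, here is a rough sketch of how such a
page could be consumed, combined with the barrier-style check Simo
describes above. Everything about the interface is an assumption: the
/dev/vmgenid node does not exist today, and the layout (one 32-bit epoch
at offset 0 of a read-only page) is only what this mail proposes. The
xorshift generator and the getrandom() reseed are placeholders for a real
DRBG, not something to ship.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/random.h>
#include <unistd.h>

/* Hypothetical: the device node and a page holding one 32-bit epoch at
 * offset 0 are only what this mail proposes; no such ABI exists today. */
#define VMGENID_PATH "/dev/vmgenid"

struct prng {
    const volatile uint32_t *epoch; /* mapped epoch page, or NULL */
    uint32_t seen;                  /* epoch observed at the last reseed */
    uint64_t state;                 /* placeholder generator state */
    unsigned reseeds;               /* number of reseeds so far */
};

/* Map the epoch page read-only; returns NULL if the device is absent. */
static const volatile uint32_t *epoch_map(const char *path)
{
    int fd = open(path, O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return NULL;
    void *p = mmap(NULL, (size_t)sysconf(_SC_PAGESIZE), PROT_READ,
                   MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after close() */
    return p == MAP_FAILED ? NULL : (const volatile uint32_t *)p;
}

/* Read the epoch between fences so that neither the compiler nor the
 * CPU moves the read across the generator work around it. */
static uint32_t epoch_load(const volatile uint32_t *p)
{
    atomic_thread_fence(memory_order_seq_cst);
    uint32_t v = p ? *p : 0;
    atomic_thread_fence(memory_order_seq_cst);
    return v;
}

/* Placeholder reseed: a real library would rekey its DRBG here. */
static void prng_reseed(struct prng *g)
{
    uint64_t s;
    if (getrandom(&s, sizeof(s), 0) != (ssize_t)sizeof(s))
        abort(); /* never continue with a stale seed */
    g->state = s | 1; /* xorshift state must be non-zero */
    g->reseeds++;
}

/* Placeholder generator (xorshift64, NOT cryptographic). */
static void prng_generate(struct prng *g, uint8_t *out, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        g->state ^= g->state << 13;
        g->state ^= g->state >> 7;
        g->state ^= g->state << 17;
        out[i] = (uint8_t)(g->state >> 32);
    }
}

/* Barrier-style check: sample the epoch, generate, then confirm the
 * epoch did not move before handing the bytes to the caller. */
static void prng_get(struct prng *g, uint8_t *out, size_t len)
{
    for (;;) {
        uint32_t now = epoch_load(g->epoch);
        if (g->reseeds == 0 || now != g->seen) {
            prng_reseed(g);
            g->seen = now;
        }
        prng_generate(g, out, len);
        if (epoch_load(g->epoch) == now)
            return;
        /* The VM was cloned mid-generation: discard and retry. */
    }
}
```

A library would call epoch_map(VMGENID_PATH) once on init and keep the
pointer in its struct prng. If that returns NULL, the epoch reads as a
constant 0 and the library behaves as it does today. The fast path costs
two memory reads and no syscall, which is the property Simo is asking for.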

User space libraries can then try to open it on init and determine their
epoch. For the async event, we add the poll() logic again to /dev/vmgenid
and make networkd, for example, use that.
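
For the event-loop consumers, the async side might look roughly like the
helper below. This is a sketch under assumptions: that the hypothetical
/dev/vmgenid fd reports a fork as POLLIN is my guess, not a defined
interface, and the function name is made up for illustration. How the
event would be acknowledged (a read(), re-reading the epoch page, or
something else) is left open.

```c
#include <errno.h>
#include <poll.h>

/* Wait until fd becomes readable, which we ASSUME is how a VM fork
 * would be signalled. Returns 1 on event, 0 on timeout, -1 on error. */
static int wait_for_fork_event(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int r;

    do {
        r = poll(&pfd, 1, timeout_ms);
    } while (r < 0 && errno == EINTR); /* restart if a signal interrupted us */

    if (r <= 0)
        return r;
    return (pfd.revents & POLLIN) ? 1 : -1;
}
```

A daemon with its own main loop would register the fd with whatever
event loop it already runs instead of blocking in a helper like this.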

Alex
