Re: [PATCH v16 4/5] random: introduce generic vDSO getrandom() implementation

Andy Lutomirski <luto@xxxxxxxxxxxxxx> · Fri, 7 Jun 2024 11:39:10 -0700

More comments...

On Tue, May 28, 2024 at 5:25 AM Jason A. Donenfeld <Jason@xxxxxxxxx> wrote:
>
> Provide a generic C vDSO getrandom() implementation, which operates on
> an opaque state returned by vgetrandom_alloc() and produces random bytes
> the same way as getrandom(). This has a the API signature:
>
>   ssize_t vgetrandom(void *buffer, size_t len, unsigned int flags, void *opaque_state);
>
> The return value and the first 3 arguments are the same as ordinary
> getrandom(), while the last argument is a pointer to the opaque
> allocated state. Were all four arguments passed to the getrandom()
> syscall, nothing different would happen, and the functions would have
> the exact same behavior.
>
> The actual vDSO RNG algorithm implemented is the same one implemented by
> drivers/char/random.c, using the same fast-erasure techniques as that.
> Should the in-kernel implementation change, so too will the vDSO one.
>
> It requires an implementation of ChaCha20 that does not use any stack,
> in order to maintain forward secrecy if a multi-threaded program forks
> (though this does not account for a similar issue with SA_SIGINFO
> copying registers to the stack), so this is left as an
> architecture-specific fill-in. Stack-less ChaCha20 is an easy algorithm
> to implement on a variety of architectures, so this shouldn't be too
> onerous.

Can you clarify this, because I'm a bit confused.  First, if a
multi-threaded program forks, bascially all bets are off -- fork() is
extremely poorly behaved in multithreaded programs, and the child
should take care to execve() or exit() in short order.  But more to
the point: If I do:

some_bytes = get_awesome_random_bytes();
<-- other thread forks here!

The bytes get copied.  Is the concern that the fork might happen *in
the middle* of the vDSO code, causing the child to end up
inadvertently possessing a copy of the parent's random state and thus
being able to predict future outputs?  If so, I think this could be
much more cleanly fixed by making sure that the vDSO state gets wiped
*for the parent and the child* on a fork.

> +       /*
> +        * If @state->generation doesn't match the kernel RNG's generation, then it means the
> +        * kernel's RNG has reseeded, and so @state->key is reseeded as well.
> +        */
> +       if (unlikely(state->generation != current_generation)) {
> +               /*
> +                * Write the generation before filling the key, in case of fork. If there is a fork
> +                * just after this line, the two forks will get different random bytes from the
> +                * syscall, which is good. However, were this line to occur after the getrandom
> +                * syscall, then both child and parent could have the same bytes and the same
> +                * generation counter, so the fork would not be detected. Therefore, write
> +                * @state->generation before the call to the getrandom syscall.
> +                */
> +               WRITE_ONCE(state->generation, current_generation);

Farther down the thread I think you were saying this had something to
do with signals, not fork.  As for fork, if you make sure that
rng_info->generation can never be 0, then, after a fork, the vDSO will
always retry or fall back, and I think there will be no complexity in
the middle related to forking, which could end up simplifying a few
things.