On Fri, 31 Dec 2021 at 12:50, Jason A. Donenfeld <Jason@xxxxxxxxx> wrote:
>
> RDRAND is not fast. RDRAND is actually quite slow. We've known this for
> a while, which is why functions like get_random_u{32,64} were converted
> to use batching of our ChaCha-based CRNG instead.
>
> Yet CRNG extraction still includes a call to RDRAND, in the hot path of
> every call to get_random_bytes(), /dev/urandom, and getrandom(2).
>
> This call to RDRAND here seems quite superfluous. CRNG is already
> extracting things based on a 256-bit key, based on good entropy, which
> is then reseeded periodically, updated, backtrack-mutated, and so
> forth. The CRNG extraction construction is something that we're already
> relying on to be secure and solid. If it's not, that's a serious
> problem, and it's unlikely that mixing in a measly 32 bits from RDRAND
> is going to alleviate things.
>
> And in the case where the CRNG doesn't have enough entropy yet, we're
> already initializing the ChaCha key row with RDRAND in
> crng_init_try_arch_early().
>
> Removing the call to RDRAND improves performance on an i7-11850H by
> 370%. In other words, the vast majority of the work done by
> extract_crng() prior to this commit was devoted to fetching 32 bits of
> RDRAND.
>
> Signed-off-by: Jason A. Donenfeld <Jason@xxxxxxxxx>
> ---
>  drivers/char/random.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/drivers/char/random.c b/drivers/char/random.c
> index 4de0feb69781..17ec60948795 100644
> --- a/drivers/char/random.c
> +++ b/drivers/char/random.c
> @@ -1023,7 +1023,7 @@ static void crng_reseed(struct crng_state *crng, struct entropy_store *r)
>  static void _extract_crng(struct crng_state *crng,
>  			  __u8 out[CHACHA_BLOCK_SIZE])
>  {
> -	unsigned long v, flags, init_time;
> +	unsigned long flags, init_time;
>
>  	if (crng_ready()) {
>  		init_time = READ_ONCE(crng->init_time);
> @@ -1033,8 +1033,6 @@ static void _extract_crng(struct crng_state *crng,
>  				    &input_pool : NULL);
>  	}
>  	spin_lock_irqsave(&crng->lock, flags);
> -	if (arch_get_random_long(&v))
> -		crng->state[14] ^= v;
>  	chacha20_block(&crng->state[0], out);
>  	if (crng->state[12] == 0)
>  		crng->state[13]++;

Given that arch_get_random_long() may be backed by other things than
special instructions on some architectures/platforms, avoiding it if we
can on any path that may be a hot path is good, so

Acked-by: Ard Biesheuvel <ardb@xxxxxxxxxx>
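
For anyone who wants to poke at the hot path outside the kernel, here is a
rough userspace sketch of the extraction step touched by the hunk above. It
is an illustration only, not the kernel code: struct crng_sketch,
chacha20_block_sketch(), extract_crng_sketch() and the arch_random_fn hook
are hypothetical names, and locking, reseeding and the init_time check are
left out. Passing a non-NULL hook models the old behaviour (XOR into state
word 14 before the block function); passing NULL models the patched path.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CHACHA_BLOCK_SIZE 64

/* Hypothetical stand-in for struct crng_state: only the ChaCha words. */
struct crng_sketch {
	uint32_t state[16];	/* constants, key, counter, nonce */
};

/* Hypothetical hook standing in for arch_get_random_long(). */
typedef bool (*arch_random_fn)(unsigned long *v);

static uint32_t rotl32(uint32_t x, int n)
{
	return (x << n) | (x >> (32 - n));
}

#define QR(a, b, c, d) do {			\
	a += b; d ^= a; d = rotl32(d, 16);	\
	c += d; b ^= c; b = rotl32(b, 12);	\
	a += b; d ^= a; d = rotl32(d, 8);	\
	c += d; b ^= c; b = rotl32(b, 7);	\
} while (0)

/* ChaCha20 block: 20 rounds, feed-forward, then bump the counter word. */
static void chacha20_block_sketch(uint32_t state[16],
				  uint8_t out[CHACHA_BLOCK_SIZE])
{
	uint32_t x[16];
	int i;

	memcpy(x, state, sizeof(x));
	for (i = 0; i < 10; i++) {
		/* column round */
		QR(x[0], x[4], x[8],  x[12]);
		QR(x[1], x[5], x[9],  x[13]);
		QR(x[2], x[6], x[10], x[14]);
		QR(x[3], x[7], x[11], x[15]);
		/* diagonal round */
		QR(x[0], x[5], x[10], x[15]);
		QR(x[1], x[6], x[11], x[12]);
		QR(x[2], x[7], x[8],  x[13]);
		QR(x[3], x[4], x[9],  x[14]);
	}
	for (i = 0; i < 16; i++) {
		uint32_t v = x[i] + state[i];

		/* serialize each word little-endian */
		out[4 * i + 0] = (uint8_t)(v);
		out[4 * i + 1] = (uint8_t)(v >> 8);
		out[4 * i + 2] = (uint8_t)(v >> 16);
		out[4 * i + 3] = (uint8_t)(v >> 24);
	}
	state[12]++;
}

/* Extraction step; arch == NULL corresponds to the patched hot path. */
static void extract_crng_sketch(struct crng_sketch *crng,
				uint8_t out[CHACHA_BLOCK_SIZE],
				arch_random_fn arch)
{
	unsigned long v;

	/* Pre-patch behaviour: one arch call per extracted block. */
	if (arch && arch(&v))
		crng->state[14] ^= (uint32_t)v;

	chacha20_block_sketch(crng->state, out);

	/* Carry the 32-bit block counter into the next word on wrap. */
	if (crng->state[12] == 0)
		crng->state[13]++;
}
```

Timing a loop of extract_crng_sketch() calls with and without a hook that
wraps whatever the platform offers should give a feel for how much of each
block's cost the arch call accounts for; the numbers will of course differ
from the in-kernel measurement quoted above.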