On 21.07.2017 11:26, Jan Glauber wrote:
> Nice catch. How much does the performance improve on Ryzen when you
> use arch_get_random_int()?

Okay, now I have some results for you:
On Ryzen 1800X (using arch_get_random_int()):
---
# dd if=/dev/urandom of=/dev/null bs=1M status=progress
8751415296 bytes (8,8 GB, 8,2 GiB) copied, 71,0079 s, 123 MB/s
# perf top
57,37% [kernel] [k] _extract_crng
26,20% [kernel] [k] chacha20_block
---
Better, but obviously there is still much room for improvement by
reducing the number of calls to RDRAND.
On Ryzen 1800X (with nordrand kernel option):
---
# dd if=/dev/urandom of=/dev/null bs=1M status=progress
22643998720 bytes (23 GB, 21 GiB) copied, 67,0025 s, 338 MB/s
---
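To put the two dd figures in perspective, here is a rough back-of-the-envelope sketch of the per-block cost. It assumes that /dev/urandom output is produced one 64-byte ChaCha20 block per _extract_crng() call, that dd's "MB/s" means 10^6 bytes per second, and that the whole throughput difference between the two runs comes from the hardware RNG call. The helper names ns_per_block() and rdrand_overhead_ns() are made up for this illustration; they are not kernel functions.

```c
/* Bytes produced by one ChaCha20 block (CHACHA20_BLOCK_SIZE in the kernel). */
#define BLOCK_BYTES 64.0

/*
 * Nanoseconds spent per 64-byte block at a given dd throughput.
 * Assumption: dd reports MB/s in units of 10^6 bytes per second.
 */
static double ns_per_block(double mb_per_s)
{
	return BLOCK_BYTES / (mb_per_s * 1e6) * 1e9;
}

/*
 * Rough per-block overhead attributed to the hardware RNG call.
 * Assumption: the entire difference between the run with RDRAND
 * and the run with "nordrand" is caused by that one call per block.
 */
static double rdrand_overhead_ns(double with_mb_s, double without_mb_s)
{
	return ns_per_block(with_mb_s) - ns_per_block(without_mb_s);
}
```

Calling rdrand_overhead_ns(123.0, 338.0) with the numbers above gives roughly 331 ns per block, i.e. about 520 ns per block with the 32-bit RDRAND call versus about 189 ns with nordrand. Treat this as an estimate only; it ignores caching, lock contention and measurement noise.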
Here is the patch I used:
--- drivers/char/random.c.orig 2017-07-03 01:07:02.000000000 +0200
+++ drivers/char/random.c 2017-07-21 11:57:40.541677118 +0200
@@ -859,13 +859,14 @@
 static void _extract_crng(struct crng_state *crng,
 			  __u8 out[CHACHA20_BLOCK_SIZE])
 {
-	unsigned long v, flags;
+	unsigned int v;
+	unsigned long flags;
 
 	if (crng_init > 1 &&
 	    time_after(jiffies, crng->init_time + CRNG_RESEED_INTERVAL))
 		crng_reseed(crng, crng == &primary_crng ? &input_pool : NULL);
 	spin_lock_irqsave(&crng->lock, flags);
-	if (arch_get_random_long(&v))
+	if (arch_get_random_int(&v))
 		crng->state[14] ^= v;
 	chacha20_block(&crng->state[0], out);
 	if (crng->state[12] == 0)
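
For readers wondering why the 32-bit variant loses nothing here: as far as I can tell, the CRNG state words are 32 bits wide (__u32), so XORing a 64-bit value into state[14] silently drops the upper half anyway. The following user-space sketch illustrates that with plain C integer types. mix_long() and mix_int() are hypothetical stand-ins for the "crng->state[14] ^= v" line before and after the patch, not kernel code.

```c
#include <stdint.h>

/*
 * Stand-in for "crng->state[14] ^= v" with a 64-bit v.
 * The XOR is computed in 64 bits, then converted back to 32 bits,
 * so the upper 32 bits of v never reach the state word.
 */
static uint32_t mix_long(uint32_t word, uint64_t v)
{
	word ^= v;
	return word;
}

/* Same operation with a 32-bit v, as in the patched code. */
static uint32_t mix_int(uint32_t word, uint32_t v)
{
	word ^= v;
	return word;
}
```

For any inputs, mix_long(w, v) equals mix_int(w, (uint32_t)v), so fetching 64 random bits from the hardware only to keep 32 of them is wasted work.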