On 2022-02-17 13:27:29 [+0100], Jason A. Donenfeld wrote:
> Sebastian - this v5 finally follows your suggestion about what
> operations to do at which phase. The only deviation from your exact
> suggestion is that I'm not checking for MIX_INFLIGHT in the online
> handler, and instead just unconditionally zero it out. I think that's an
> acceptable tradeoff to make for simplicity, and it just means we'll
> accumulate even more entropy, which is fine. Hopefully this is an easy
> ack and has no more pitfalls! -Jason

So I think this is the latest, right?
What do you think about this small comment update? :)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 373af789da7ab..3fb14ac1074c5 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -698,10 +698,6 @@ u32 get_random_u32(void)
 EXPORT_SYMBOL(get_random_u32);
 
 #ifdef CONFIG_SMP
-/*
- * This function is called by the cpuhp system, wired up via the large
- * static array in kernel/cpu.c, with the entry CPUHP_RANDOM_PREPARE.
- */
 int random_prepare_cpu(unsigned int cpu)
 {
 	/*
@@ -1238,17 +1234,18 @@ static void fast_mix(u32 pool[4])
 static DEFINE_PER_CPU(struct fast_pool, irq_randomness);
 
 #ifdef CONFIG_SMP
-/*
- * This function is called by the cpuhp system, wired up via the large
- * static array in kernel/cpu.c, with the entry CPUHP_AP_RANDOM_ONLINE.
- */
 int random_online_cpu(unsigned int cpu)
 {
 	/*
-	 * Set irq randomness count to zero so that new accumulated
-	 * irqs are fresh, and more importantly, so that its worker
-	 * is permitted to schedule again when it comes back online,
-	 * since the MIX_INFLIGHT flag will be cleared.
+	 * This function is invoked while the CPU is coming up, after the
+	 * workqueue subsystem has been fully initialized for this CPU.
+	 *
+	 * During CPU bring-up and shutdown we may schedule a worker (and
+	 * set MIX_INFLIGHT) which will run on another CPU, because the
+	 * workqueues were not yet ready for this CPU. In this case the
+	 * worker will recognize that it runs on the wrong CPU and do
+	 * nothing. Therefore the count is unconditionally set to zero,
+	 * which permits scheduling a new worker soon.
 	 */
 	per_cpu_ptr(&irq_randomness, cpu)->count = 0;
 	return 0;

> --- a/drivers/char/random.c
> +++ b/drivers/char/random.c
> @@ -697,6 +697,25 @@ u32 get_random_u32(void)
>  }
>  EXPORT_SYMBOL(get_random_u32);
>  
> +#ifdef CONFIG_SMP
> +/*
> + * This function is called by the cpuhp system, wired up via the large
> + * static array in kernel/cpu.c, with the entry CPUHP_RANDOM_PREPARE.
> + */
> +int random_prepare_cpu(unsigned int cpu)
> +{
> +	/*
> +	 * When the cpu comes back online, immediately invalidate both
> +	 * the per-cpu crng and all batches, so that we serve fresh
> +	 * randomness.
> +	 */
> +	per_cpu_ptr(&crngs, cpu)->generation = ULONG_MAX;
> +	per_cpu_ptr(&batched_entropy_u32, cpu)->position = UINT_MAX;
> +	per_cpu_ptr(&batched_entropy_u64, cpu)->position = UINT_MAX;

This runs before the CPU is up. Could you do the initialisation right
now? My problem here is that if this (get_random_u32()) is used between
CPUHP_AP_IDLE_DEAD and CPUHP_TEARDOWN_CPU, then the initialisation will
happen on the target CPU with interrupts disabled, and it will acquire a
sleeping lock (batched_entropy_u32.lock).
You could perform the initialization cross-CPU without the lock, because
the CPU itself isn't ready yet. Something like

	batch = per_cpu_ptr(&batched_entropy_u32, cpu);
	_get_random_bytes(batch->entropy_u32, sizeof(batch->entropy_u32));
	batch->position = 0;
	batch->generation = next_gen;

should do the job. Plus the same for u64. I haven't figured out the
generation thingy yet, but I suspect that a sleeping lock is involved.
I did not check whether this is a problem on PREEMPT_RT yet, but it may
become one later (if we get a user in the hotplug path).

Sebastian