On Mon, Dec 20, 2021 at 04:07:28PM +0100, Jason A. Donenfeld wrote: > Hi Eric, > > This patch seems fine to me, and I'll apply it in a few days after > sitting on the list for comments, but: > > > Note: READ_ONCE() could be used instead of smp_load_acquire(), but it is > > harder to verify that it is correct, so I'd prefer not to use it here. > > (https://lore.kernel.org/lkml/20200916233042.51634-1-ebiggers@xxxxxxxxxx/T/#u), > > and though it's a correct fix, it was derailed by a debate about whether > > it's safe to use READ_ONCE() instead of smp_load_acquire() or not. > > But holy smokes... I chuckled at your, "please explain in English." :) > > Paul - if you'd like to look at this patch and confirm that this > specific patch and usage is fine to be changed into READ_ONCE() > instead of smp_load_acquire(), please pipe up here. And I really do > mean this specific patch and usage, not to be confused with any other > usage elsewhere in the kernel or question about general things, which > doubtlessly involve larger discussions like the one Eric linked to > above. If you're certain this patch here is READ_ONCE()able, I'd > appreciate your saying so with a simple, "it is safe; go for it", > since I'd definitely like the optimization if it's safe. If I don't > hear from you, I'll apply this as-is from Eric, as I'd rather be safe > than sorry. First I would want to see some evidence that READ_ONCE() was really providing measurable performance benefit. Such evidence would be easiest to obtain by running on a weakly ordered system such as ARM, ARMv8, or PowerPC. If this does provide a measurable benefit, why not the following? static inline struct crng_state *select_crng(void) { struct crng_state **pool; struct crng_state *pooln; int nid = numa_node_id(); /* pairs with cmpxchg_release() in do_numa_crng_init() */ pool = rcu_dereference(&crng_node_pool); if (pool) { pooln = rcu_dereference(pool[nid]); if (pooln) return pooln; } return &primary_crng; } This is in ignorance of the kfree() side of this code. So another question is "Suppose that there was a long delay (vCPU preemption, for example) just before the 'return pooln'. What prevents a use-after-free bug?" Of course, this question applies equally to the smp_load_acquire() version. Thanx, Paul