On Sun, Jul 31, 2022 at 01:45:43AM +0200, Jason A. Donenfeld wrote:
> So, anyway, if I do muster a v2 of this (perhaps just to see the idea
> through), the API might split in two to something like:
>
>   void *getrandom_allocate_states([inout] size_t *number_of_states, [out] size_t *length_per_state);
>   ssize_t getrandom(void *state, void *buffer, size_t len, unsigned long flags);
>
> User code will call getrandom_allocate_state(), which will allocate
> enough pages to hold *number_of_states, and return the size of each one
> in length_per_state and the number actually allocated back in
> number_of_states. The result can then be sliced up by that size, and
> passed to getrandom(). So glibc or whatever would presumably allocate
> one per thread, and handle any reentrancy/locking around it.
>
> Or some other variation on that. I'm sure you hate those function
> signatures. Everybody loves to bikeshed APIs, right? There's plenty to
> be tweaked. But that's anyhow about where my thinking is for a potential
> v2.

Doing this also doubled performance, perhaps unsurprisingly, as that
getcpu() operation wasn't free. For uint32_t generation:

  vdso:    25000000 times in 0.289876265 seconds
  syscall: 25000000 times in 4.296636025 seconds
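
To make the "slice it up by that size" step concrete, here is a rough userspace sketch of how a caller might consume the two proposed functions. Nothing below is the real interface: mock_allocate_states() and mock_state_getrandom() are hypothetical stand-ins with the same shape as the signatures quoted above, MOCK_STATE_LEN is an invented size, and the mock simply ignores the state and falls through to the existing getrandom(2) syscall wrapper. Only state_for_index() reflects the slicing idea itself.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/random.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-state size; the real value would be reported by the kernel. */
#define MOCK_STATE_LEN 256

/*
 * Stand-in for the proposed getrandom_allocate_states(): rounds the request
 * up to whole pages, then reports how many states actually fit and how
 * large each one is.
 */
static void *mock_allocate_states(size_t *number_of_states,
				  size_t *length_per_state)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	size_t want = *number_of_states * MOCK_STATE_LEN;
	size_t bytes = (want + page - 1) / page * page;
	void *mem;

	if (!bytes)
		return NULL;
	mem = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (mem == MAP_FAILED)
		return NULL;

	*number_of_states = bytes / MOCK_STATE_LEN;
	*length_per_state = MOCK_STATE_LEN;
	return mem;
}

/*
 * Stand-in for the proposed four-argument getrandom(). This mock ignores
 * the state entirely and uses the ordinary syscall wrapper.
 */
static ssize_t mock_state_getrandom(void *state, void *buffer, size_t len,
				    unsigned long flags)
{
	(void)state;
	return getrandom(buffer, len, (unsigned int)flags);
}

/* Slice the allocation: the i-th state starts i * length_per_state bytes in. */
static void *state_for_index(void *states, size_t length_per_state, size_t i)
{
	assert(states != NULL);
	return (char *)states + i * length_per_state;
}
```

A libc would presumably call the allocator once, hand state_for_index(states, len, n) to its n-th thread, and take care of reentrancy and locking around each slice itself, as described in the quoted text.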