Hey Linus,

Thanks a bunch for chiming in. Indeed this whole thing is kind of crazy,
so your input is particularly useful here.

On Sat, Jul 30, 2022 at 08:48:42AM -0700, Linus Torvalds wrote:
> It's just too specialized, and the people who care about performance
> can - and do - do special things anyway.

I followed most of your email, but I just wanted to point out that the
"can" part of this isn't quite right, though the "do" part is.
Specifically, I don't think there's currently a good way for userspace
to do this kind of thing and get the same kind of security guarantees
that the syscall has. They "do" it anyway, though (openssl, libgcrypt,
glibc's arc4random() implementation before I tamed it last week, etc),
and this is somewhat concerning.

So my larger intent is, assuming that people will continue to attempt
such things, to just nip the issue in the bud by providing an actually
safe way for it to be done.

To be clear, I really would rather not do this. I'm not really looking
for more stuff to do, and I don't tend to write (public) code "just
'cuz". My worry is that by /not/ doing it, footguns will proliferate.
The glibc thing was what finally motivated me to want to at least sketch
out a potential action to make this kind of (apparently common) urge of
writing a userspace RNG safer.

(Actually coding it up didn't really take much time, which perhaps
shows: that `if (!len)` check needs to be hoisted out of the inner
block!)

> So I'm really not convinced that this kind of thing is something the
> kernel should work that hard to help.
>
> Your patch fundamentally seems to be about "make it easy to not have
> to care, and still get high performance", but that's _such_ a small
> use-case (the intersection between "don't care" and "really really
> need high performance" would seem to be basically NIL).

So this is "statement (1)" stuff. Namely, userspace apparently wants
faster random numbers. Is this desire justified?
Has anybody aside from Phoronix even benchmarked getrandom() since I did
the neat lockless stuff to it? Is this just for some artificial card
shuffling unit tests, or is generating TLS CBC nonces at scale using
getrandom() a real bottleneck for a real use case? I'm honestly not
quite sure.

But I do know that people are building these userspace RNGs anyway, and
will keep building them, and that kind of worries me.

So either this is a useful thing to have, and people are building it
anyway, so maybe the kernel should get involved. Or it's not a useful
thing to have, BUT people are building it anyway, so maybe the kernel
should [not?] get involved? The latter case is a bit decisionally
hairier.

Anyway, onto the technical feedback:

> And that state allocation in particular looks very random in all the
> wrong ways, with various "if I run out of resources I'll just do a
> system call" things.
>
> Not to mention that I don't think your patch can work anyway, with
> things like "cmpxchg()" not being something you can do in the vdso
> because it might have the kernel instrumentation in it.

Yea this sharding thing is somewhat faulty. In its current incarnation,
it also falls over during fork, since the cmpxchg pseudo trylock is
dropped, among other problems Andy and I discussed on IRC. Andy also
suggested not doing the allocation inside of the same function. Florian
brought up the difficulty of even determining the CPU number on arm64.
And also that's a good point about instrumentation on cmpxchg.

So, anyway, if I do muster a v2 of this (perhaps just to see the idea
through), the API might split in two to something like:

  void *getrandom_allocate_states([inout] size_t *number_of_states,
                                  [out] size_t *length_per_state);
  ssize_t getrandom(void *state, void *buffer, size_t len,
                    unsigned long flags);

User code will call getrandom_allocate_states(), which will allocate
enough pages to hold *number_of_states, and return the size of each one
in length_per_state and the number actually allocated back in
number_of_states. The result can then be sliced up by that size, and
passed to getrandom(). So glibc or whatever would presumably allocate
one per thread, and handle any reentrancy/locking around it.

Or some other variation on that. I'm sure you hate those function
signatures. Everybody loves to bikeshed APIs, right? There's plenty to
be tweaked. But that's anyhow about where my thinking is for a potential
v2.

Jason
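
P.S. To make the "sliced up by that size" part concrete, here is a rough
sketch of how a caller might use the two proposed functions. None of
this exists: sketch_allocate_states() and sketch_getrandom() are
hypothetical stand-ins with the proposed signatures, the 64-byte state
size is an arbitrary placeholder, and the stand-in simply ignores the
state and calls the real getrandom(2) wrapper. Only the pointer
arithmetic is the point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/random.h>
#include <sys/types.h>

/* HYPOTHETICAL stand-in for the proposed getrandom_allocate_states().
 * It just calloc()s opaque blobs; a real version would hand back
 * specially mapped pages. The 64-byte size is an assumption. */
static void *sketch_allocate_states(size_t *number_of_states,
                                    size_t *length_per_state)
{
    void *base;

    *length_per_state = 64;
    base = calloc(*number_of_states, *length_per_state);
    if (!base)
        *number_of_states = 0; /* report how many were really allocated */
    return base;
}

/* HYPOTHETICAL stand-in for the proposed stateful getrandom().
 * It ignores the state and falls through to the real syscall wrapper. */
static ssize_t sketch_getrandom(void *state, void *buffer, size_t len,
                                unsigned long flags)
{
    (void)state;
    return getrandom(buffer, len, (unsigned int)flags);
}

/* Slice the allocation: state i lives at base + i * length_per_state.
 * A libc would presumably hand one slice to each thread. */
static void *state_for_thread(void *base, size_t length_per_state,
                              size_t number_of_states, size_t i)
{
    assert(i < number_of_states);
    return (unsigned char *)base + i * length_per_state;
}
```

A libc would then do something like sketch_getrandom(state_for_thread(
base, len, n, tid_index), buf, buflen, 0) on each call, with whatever
reentrancy/locking it wants around the per-thread state.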