Hi Holger,

On Fri, Jul 22, 2022 at 10:08:05AM +0200, Holger Dengler wrote:
> Why not changing the API to take bytes instead of words? Sure, at the
> moment it looks like all platforms with TRNG support are able to
> deliver at least one word, but bytes would be more flexible.

The idea is to strike a sweet spot between capabilities. S390x is fine
with byte-level granularity up to arbitrary lengths, while x86 is best
with word-level granularity of length 1. The happy intersection between
the two is just word-level granularity of arbitrary length. Yes we
_could_ introduce a lot of code complexity by cascading the x86 case
down into smaller and smaller registers, ignoring the fact that it's no
longer efficient below 32- or 64-bit registers depending on vendor. But
then we're relying on the inliner to remove all of that extra code,
since all callers actually only ever want 32 or 64 bytes. Why bloat for
nothing? The beauty of this approach is that it translates very
naturally over all the various quirks of architectures without having
to have a lot of coupling code.

The other reason is that it's simply not necessary. The primary use for
this in random.c is to fill a 32- or 64-*byte* block with "some stuff",
preferring RDSEED, then RDRAND, and finally falling back to RDTSC. These
correspond with arch_get_random_seed_longs(), arch_get_random_longs(),
and random_get_entropy() (which is usually get_cycles() underneath),
respectively.
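
To make the "happy intersection" a bit more concrete, here is a rough
userspace sketch, not the actual kernel code. Everything in it
(fake_hw_word(), x86_like_get_random_longs(),
s390_like_get_random_longs(), fill_words()) is a made-up stand-in; the
only thing borrowed from the real API is the shape of the contract:
"store up to max_longs words, return how many were stored".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fake "hardware" so the sketch runs in userspace. */
static unsigned long fake_hw_word(void)
{
	static unsigned long state = 12345UL;

	state = state * 1103515245UL + 12345UL;
	return state;
}

/* x86-like backend (assumption): one instruction, one word per call. */
static size_t x86_like_get_random_longs(unsigned long *v, size_t max_longs)
{
	if (max_longs == 0)
		return 0;
	v[0] = fake_hw_word();
	return 1;
}

/*
 * s390x-like backend (assumption): the hardware fills a byte buffer of
 * any length, so the whole request is satisfied in one go.
 */
static size_t s390_like_get_random_longs(unsigned long *v, size_t max_longs)
{
	unsigned char *p = (unsigned char *)v;
	size_t i, nbytes = max_longs * sizeof(*v);

	for (i = 0; i < nbytes; i++)
		p[i] = (unsigned char)(fake_hw_word() >> 16);
	return max_longs;
}

typedef size_t (*get_longs_fn)(unsigned long *v, size_t max_longs);

/*
 * The caller's loop is the same for both backends. Returns the number
 * of backend calls that were made, to show the difference in shape.
 */
static size_t fill_words(get_longs_fn fn, unsigned long *buf, size_t n)
{
	size_t i = 0, calls = 0;

	while (i < n) {
		size_t got = fn(&buf[i], n - i);

		calls++;
		assert(got <= n - i);	/* a backend never overfills */
		if (!got)
			break;		/* backend gave up; stop early */
		i += got;
	}
	return calls;
}
```

With an 8-word buffer, the x86-like stand-in needs 8 calls and the
s390x-like one needs a single call, yet the caller is written once and
never has to think in bytes.
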
With the cycle counter being (at least) ~word-sized on all platforms,
keeping the granularity of the arch_get_random_*_longs() functions the
same lets us fill these with a basic cascade that doesn't require a lot
of code:

    unsigned long array[whatever];
    for (i = 0; i < ARRAY_SIZE(array);) {
        longs = arch_get_random_seed_longs(&array[i], ARRAY_SIZE(array) - i);
        if (longs) {
            i += longs;
            continue;
        }
        longs = arch_get_random_longs(&array[i], ARRAY_SIZE(array) - i);
        if (longs) {
            i += longs;
            continue;
        }
        array[i++] = random_get_entropy();
    }

By using a word as the underlying unit, the above cascade generates
optimal code on basically all archrandom platforms, no matter what
their byte-vs-word or one-vs-three-vs-many semantics are.

That's a bit long winded, but hopefully that gives a bit of insight on
why going from _long -> _longs is so "lazy" looking.

Jason