(Apologies for the preceding mangled mail. I lost my connection to my mail host, and it has a bad habit of sending the current edit buffer when I disconnect unexpectedly.)

On Wed, 3 Apr 2019 at 12:51:48 +0200, Harald Freudenberger wrote:
> Then someone explained to me that a sha256 can never produce 256 bits of
> entropy as there may exist collisions. One must assume that the output
> of sha256 will have 255 bits of entropy at most. However, I decided to
> double all the buffers and use sha512 to be on the safe side and be able
> to hold the statement about the 256 bits of entropy within the 32 bytes
> of random produced.

I just spent a couple of hours writing a long explanation of the statistics of random functions and the effect of collisions on entropy before I started wondering what the seed material was actually used for, and traced the code back through dz9zr010.pdf, which told me this whole thing is just an implementation of NIST SP800-90A! (Specifically, it's an implementation of Hash_DRBG using SHA-512 as a base.)

That changes lots of things. There's no upper limit on seed size (okay, 2^35 bits), but 256 is a desired *lower* limit on seed entropy. Things make a lot more sense now.

Whenever you want n bits of entropy, you want to avoid bottlenecks. There's wide uncertainty in how much entropy the seed material contains. You aim for a conservative lower bound, but what you actually care about is an attacker's uncertainty about the key material, and that's subjective. So it's an important design criterion to allow plenty of headroom in entropy buffer sizes: don't even *try* to store n bits of entropy in an n-bit buffer, but allow at least a 2:1 margin.

This can be seen in the Hash_DRBG design. It wants 256 bits of entropy, but uses an 888-bit "seed length" internally. (It actually uses two such variables, V and C, but C is regenerated from V on each seeding, so there's an 888-bit bottleneck; see the sketch below.)

So my initial opinion is that you should just take the entire 8K of timestamps *plus* the get_random_bytes() output and use that as seed material. It'll get hashed twice to generate the internal state V, but that's no biggie.

However! The z/Arch PPNO instruction has a limit of 512 bytes on the seed material, so we can't do that. Some manual compression is required, and your idea of using SHA-512 for it is fine. The two main things the current code does wrong are:

- There's no reason to compress things by XORing the get_random_bytes() and SHA-512 outputs together. Just concatenate them in the seed buffer.

- Calling generate_entropy() with a length that's not a multiple of 64 bytes is actively bad. You're generating seed timestamps with *at least* 256 bits of entropy, then (in prng_sha512_instantiate()) using only *at most* 256 bits of that. That's just wasteful. Use the whole 512 bits of SHA-512 output you just computed.

What you should do to instantiate (sketched in code after this list) is:

- Allocate a 64 + 64 + 48-byte buffer (512 + 512 + 384 bits).

- Generate a bunch of timestamps and hash them to produce the first 512 bits.

- Do it again for the second bunch. (If you want only 384 bits of seed entropy, generate fewer timestamps, but don't use less hash output.)

- Call get_random_bytes() for the last 48 bytes. (Or whatever you think a reasonable security parameter is. 32 bytes is fine, too, but there's no harm in a little more *because the DRBG has enough state space to store the excess*.)

- Feed the entire 176-byte buffer to CPACF_PRNO_SHA512_DRNG_SEED.

There are other slight permutations of this, but that's the basic idea.
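For reference, here's the shape of the SP800-90A seeding step that creates that 888-bit bottleneck. This is illustrative kernel-style C written from the standard (<linux/types.h> and <linux/string.h> assumed), not code from the driver; hash_df() stands for the standard's SHA-512-based derivation function, and the struct and names are mine:

#define DRBG_SEEDLEN	111	/* 888 bits: SP800-90A seedlen for SHA-512 */

struct hash_drbg {
	u8 v[DRBG_SEEDLEN];
	u8 c[DRBG_SEEDLEN];
};

/* SP800-90A 10.3.1: iterate SHA-512 over counter || outbits || in,
 * concatenating digests until outlen bytes have been produced. */
static void hash_df(const u8 *in, size_t inlen, u8 *out, size_t outlen);

/* Instantiate, SP800-90A 10.1.1.2, where seed_material is
 * entropy_input || nonce || personalization_string.  Reseed (10.1.1.3)
 * is the same except that seed_material is
 * 0x01 || V || fresh entropy || additional input. */
static void hash_drbg_instantiate(struct hash_drbg *d,
				  const u8 *seed_material, size_t len)
{
	u8 tmp[1 + DRBG_SEEDLEN];

	/* V = Hash_df(seed_material, 888): arbitrarily large input,
	 * 888 bits out. */
	hash_df(seed_material, len, d->v, DRBG_SEEDLEN);

	/* C = Hash_df(0x00 || V, 888): C is a pure function of V, so
	 * however much entropy went in, at most 888 bits survive. */
	tmp[0] = 0x00;
	memcpy(tmp + 1, d->v, DRBG_SEEDLEN);
	hash_df(tmp, sizeof(tmp), d->c, DRBG_SEEDLEN);
}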
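And here's a minimal sketch of that five-step instantiation. generate_entropy(), get_random_bytes(), and CPACF_PRNO_SHA512_DRNG_SEED are the names already in play in the patch; the exact cpacf_prno() call, the prng_data->prnows parameter block, and generate_entropy()'s return convention are written from memory, so treat those details as assumptions:

/* assumes <linux/random.h>, <linux/string.h>, <asm/cpacf.h> */
static int prng_sha512_instantiate_sketch(void)
{
	u8 seed[64 + 64 + 48];	/* 512 + 512 + 384 bits */
	int ret;

	/* two independent timestamp-entropy passes, each keeping the
	 * full 512-bit SHA-512 output, no truncation */
	ret = generate_entropy(seed, 64);
	if (ret < 0)
		goto out;
	ret = generate_entropy(seed + 64, 64);
	if (ret < 0)
		goto out;

	/* security parameter: 32 bytes would do, but 48 costs nothing
	 * because the DRBG state has room for the excess */
	get_random_bytes(seed + 128, 48);

	/* hand the whole 176 bytes to the DRBG's seed operation */
	cpacf_prno(CPACF_PRNO_SHA512_DRNG_SEED, &prng_data->prnows,
		   NULL, 0, seed, sizeof(seed));
	ret = 0;
out:
	memzero_explicit(seed, sizeof(seed));
	return ret;
}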
It would actually be better to generate all the timestamps in one big buffer and hash them twice (with different starting state values; the output of pass #1 will do fine as the start of pass #2) than to use independent passes, but the buffer is already inconveniently large.

But! You could get creative with KIMD and maintain two states (you have the buffer space for it, after all), generate timestamps a page at a time, and hash the pages one at a time; see the sketch at the end of this mail. I don't know the startup cost of KIMD, or how much you save by providing a contiguous buffer.

One thing I note is that you don't bother initializing the hash[] array in generate_entropy() before using it as the SHA-512 IV. Not bad, but it deserves a comment. As does the fact that you don't finalize the hash.
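A rough sketch of that two-state KIMD idea. cpacf_kimd() and CPACF_KIMD_SHA_512 are the wrappers generate_entropy() already uses; npages and fill_page_with_timestamps() are hypothetical stand-ins for however many timestamp pages you generate and however you fill them, and the choice of starting states is mine:

/* assumes <linux/gfp.h>, <linux/string.h>, <asm/cpacf.h> */
static int gather_timestamp_entropy(u8 seed[128], int npages)
{
	u8 *page;
	int i;

	page = (u8 *)__get_free_page(GFP_KERNEL);
	if (!page)
		return -ENOMEM;

	/* Like hash[] in generate_entropy(), these starting values
	 * are not the standard SHA-512 IV and don't need to be; they
	 * only have to differ from each other.  (Worth a comment in
	 * the real code either way.) */
	memset(seed, 0x00, 64);
	memset(seed + 64, 0xff, 64);

	for (i = 0; i < npages; i++) {
		/* hypothetical helper: fill the page with timestamps
		 * the way generate_entropy() does today */
		fill_page_with_timestamps(page);

		/* feed the same page into both running hash states */
		cpacf_kimd(CPACF_KIMD_SHA_512, seed, page, PAGE_SIZE);
		cpacf_kimd(CPACF_KIMD_SHA_512, seed + 64, page, PAGE_SIZE);
	}

	/* seed[] now holds two 512-bit chaining values.  Neither hash
	 * is finalized (no padding block); that's fine for entropy
	 * gathering, but it deserves a comment too. */
	free_page((unsigned long)page);
	return 0;
}

This keeps only one page live at a time while still getting two independent 512-bit chunks out of the same timestamp stream.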