On Wed, Sep 11, 2019 at 05:45:38PM +0100, Linus Torvalds wrote:
> On Wed, Sep 11, 2019 at 5:07 PM Theodore Y. Ts'o <tytso@xxxxxxx> wrote:
> > >
> > > Ted, comments? I'd hate to revert the ext4 thing just because it
> > > happens to expose a bad thing in user space.
> >
> > Unfortunately, I very much doubt this is going to work. That's
> > because the add_disk_randomness() path is only used for legacy
> > /dev/random [...]
> >
> > Also, because by default, the vast majority of disks have
> > /sys/block/XXX/queue/add_random set to zero.
>
> Gaah. I was looking at the input randomness, since I thought that was
> where the added randomness that Ahmed got things to work with came
> from.
>
> And that then made me just look at the legacy disk randomness (for the
> obvious disk IO reasons) and I didn't look further.
>

Yup, I confirm that the quick patch kept the situation as-is. I was
going to debug why, but now we know the answer.

> > So the way we get entropy these days for initializing the CRNG is
> > via the add_interrupt_randomness() path, where we do something really
> > fast, and we assume that we get enough uncertainty from 8 interrupts
> > to give us one bit of entropy (64 interrupts to give us a byte of
> > entropy), and that we need 512 bits of entropy to consider the CRNG
> > fully initialized. (Yeah, there's a lot of conservatism in those
> > estimates, and so what we could do is decide to say, cut down the
> > number of bits needed to initialize the CRNG to be 256 bits, since
> > that's the size of the CHACHA20 cipher.)
>
> So that's 4k interrupts if I counted right, and yeah, maybe Ahmed was
> just close enough before, and the merging of the inode table IO then
> took him below that limit.
>
> > Ultimately, though, we need to find *some* way to fix userspace's
> > assumptions that they can always get high quality entropy in early
> > boot, or we need to get over people's distrust of Intel and RDRAND.
> >
> Well, even on a PC, sometimes rdrand just isn't there. AMD has screwed
> it up a few times, and older Intel chips just don't have it.
>
> So I'd be inclined to either lower the limit regardless -

ACK :)

> and perhaps make the "user space asked for randomness much too
> early" be a big *warning* instead of being a basically fatal hung
> machine?

Hmmm, regarding "randomness request much too early", how much is time
really a factor here? I tested leaving the machine even for 15+
minutes, and it still didn't continue booting: the boot is practically
blocked forever...

Or is the theory that hopefully once the machine is un-stuck, more
sources of entropy will be available? If that's the case, then
possibly (rate-limited):

  "urandom: process XX asked for YY bytes. CRNG not yet initialized"

> Linus

thanks,

--
darwi
http://darwish.chasingpointers.com
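
For reference, the interrupt arithmetic quoted above can be checked with a
small sketch. It only restates the estimates from this thread (8 interrupts
credited as one bit, 512 bits required today, 256 bits as the proposed lower
limit); the constant and function names are made up for illustration and are
not kernel identifiers.

```python
# Assumed estimate from the thread: 8 interrupts are credited as 1 bit
# of entropy. The name is illustrative, not a kernel identifier.
INTERRUPTS_PER_ENTROPY_BIT = 8


def interrupts_needed(entropy_bits: int) -> int:
    """Interrupts required to credit `entropy_bits` under the 8:1 estimate."""
    return entropy_bits * INTERRUPTS_PER_ENTROPY_BIT


if __name__ == "__main__":
    # 512 bits is the current threshold mentioned in the thread,
    # 256 bits the proposed reduced one.
    for bits in (512, 256):
        print(f"{bits} bits -> {interrupts_needed(bits)} interrupts")
```

Under these assumptions the current threshold works out to 4096 interrupts
(the "4k" above), and lowering it to 256 bits would halve that to 2048.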