On Tue, Sep 10, 2019 at 12:33:12PM +0100, Linus Torvalds wrote:
> On Tue, Sep 10, 2019 at 5:21 AM Ahmed S. Darwish <darwish.07@xxxxxxxxx> wrote:
> >
> > The commit b03755ad6f33 (ext4: make __ext4_get_inode_loc plug), [1]
> > which was merged in v5.3-rc1, *always* leads to a blocked boot on my
> > system due to low entropy.
>
> Exactly what is it that blocks on entropy? Nobody should do that
> during boot, because on some systems entropy is really really low
> (think flash memory with polling IO etc).
>

Ok, I've tracked it down further. It's unfortunately GDM
intentionally blocking on a getrandom(buf, 16, 0).

Booting the system with an straced GDM service
("ExecStart=strace -f /usr/bin/gdm") reveals:

  ...
  [    3.779375] strace[262]: [pid   323] execve("/usr/lib/gnome-session-binary", ... /* 28 vars */) = 0
  ...
  [    4.019227] strace[262]: [pid   323] getrandom( <unfinished ...>
  [   79.601433] kernel: random: crng init done
  [   79.601443] kernel: random: 3 urandom warning(s) missed due to ratelimiting
  [   79.601262] strace[262]: [pid   323] <... getrandom resumed>..., 16, 0) = 16
  [   79.601262] strace[262]: [pid   323] getrandom(..., 16, 0) = 16
  [   79.603041] strace[262]: [pid   323] getrandom(..., 16, 0) = 16
  [   79.603041] strace[262]: [pid   323] getrandom(..., 16, 0) = 16
  [   79.603041] strace[262]: [pid   323] getrandom(..., 16, 0) = 16

As can be seen in the timestamps, the GDM boot was only continued
by typing randomly on the keyboard..

> That said, I would have expected that any PC gets plenty of entropy.
> Are you sure it's entropy that is blocking, and not perhaps some odd
> "forgot to unplug" situation?
>

Yes, doing any of below steps makes the problem reliably disappear:

  - boot param "random.trust_cpu=on"
  - rngd(8) enabled at boot (entropy source: x86 RDRAND + jitter)
  - pressing random 3 or 4 keyboard keys while GDM boot is stuck

> > Can this even be considered a user-space breakage? I'm honestly not
> > sure. On my modern RDRAND-capable x86, just running rng-tools rngd(8)
> > early-on fixes the problem. I'm not sure about the status of older
> > CPUs though.
>
> It's definitely breakage, although rather odd. I would have expected
> us to have other sources of entropy than just the disk. Did we stop
> doing low bits of TSC from timer interrupts etc?
>

Exactly.

While gnome-session is obviously at fault here by requiring
*blocking* randomness at the boot path, it's still not requesting
much, just (5 * 16) bytes to be exact.

I guess an x86 laptop should be able to provide that, even without
RDRAND / random.trust_cpu=on (TSC jitter, etc.) ?

thanks,
--darwi

> Ted, either way - ext4 IO patterns or random number entropy - this is
> your code. Comments?
>
> Linus
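
For reference, the getrandom(buf, 16, 0) call in the strace output above blocks until "crng init done". A minimal sketch of the non-blocking variant follows, assuming glibc >= 2.25 for <sys/random.h>. The helper name try_getrandom() is hypothetical and is not taken from GDM or gnome-session; what a caller should do when the pool is not yet ready is left open here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>   /* getrandom(), GRND_NONBLOCK (glibc >= 2.25) */
#include <sys/types.h>    /* ssize_t */

/*
 * Hypothetical helper, for illustration only.
 *
 * Returns  1 if buf was completely filled,
 *          0 if the kernel CRNG is not initialised yet (EAGAIN),
 *         -1 on any other error (errno is left set by getrandom()).
 */
int try_getrandom(void *buf, size_t len)
{
    ssize_t n;

    /* Per getrandom(2), requests of up to 256 bytes are returned in
     * full once the pool is initialised, so keep requests small. */
    assert(len <= 256);

    do {
        /* GRND_NONBLOCK: fail with EAGAIN instead of sleeping
         * until the CRNG has been initialised. */
        n = getrandom(buf, len, GRND_NONBLOCK);
    } while (n < 0 && errno == EINTR);

    if (n == (ssize_t)len)
        return 1;
    if (n < 0 && errno == EAGAIN)
        return 0;
    return -1;
}
```

With flags set to 0, as in the trace, the same call sleeps until the pool is initialised, which is the roughly 75 second stall seen between 4.019227 and 79.601262.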