Re: [REGRESSION][BISECTED] Boot stall from merge tag 'net-next-6.2'

On Wed, 21 Jun 2023 at 19:57, Jason A. Donenfeld <Jason@xxxxxxxxx> wrote:
>
> +Ard - any ideas here?
>
> On Wed, Jun 21, 2023 at 10:46 AM Linux regression tracking (Thorsten
> Leemhuis) <regressions@xxxxxxxxxxxxx> wrote:
> >
> > [added Jason (who authored the culprit) to the list of recipients; moved
> > net people and list to BCC, guess they are not much interested in this
> > anymore then]
> >
> > On 21.06.23 08:07, Sami Korkalainen wrote:
> > > I bisected again. It seems I made some mistake last time, as I got a
> > > different result this time. Maybe, because these problematic kernels may
> > > boot fine sometimes, like I said before.
> > >
> > > Anyway, first bad commit (makes much more sense this time):
> > > e7b813b32a42a3a6281a4fd9ae7700a0257c1d50 efi: random: refresh
> > > non-volatile random seed when RNG is initialized
> > >
> > > I confirmed that this is the code causing the issue by commenting it
> > > out (see the patch file). Without this code, the latest mainline boots fine.
> >
> > Jason, in that case it seems this is something for you. For the initial
> > report, see here:
> >
> > https://lore.kernel.org/all/GQUnKz2al3yke5mB2i1kp3SzNHjK8vi6KJEh7rnLrOQ24OrlljeCyeWveLW9pICEmB9Qc8PKdNt3w1t_g3-Uvxq1l8Wj67PpoMeWDoH8PKk=@proton.me/
> >
> > Quoting a part of it:
> >
> > ```
> > Linux 6.2 and newer are (mostly) unbootable on my old HP 6730b laptop,
> > the 6.1.30 works still fine.
> > The weirdest thing is that newer kernels (like 6.3.4 and 6.4-rc3) may
> > boot ok on the first try, but when rebooting, the very same version
> > doesn't boot.
> >
> > Some times, when trying to boot, I get this message repeated forever:
> > ACPI Error: No handler or method for GPE [XX], disabling event
> > (20221020/evgpe-839)
> > On newer kernels, the date is 20230331 instead of 20221020. There is
> > also some other error, but I can't read it as it gets overwritten by the
> > other ACPI error, see image linked at the end.
> >
> > And some times, the screen will just stay completely blank.
> >
> > I tried booting with acpi=off, but it does not help.

Catching up with email after my vacation, apologies for the delay.

This ship seems to have sailed in the meantime, but I'll contribute
some observations anyway.

The machine in question appears to be a Vista-era Windows laptop, and I
am not surprised at all that the firmware is flaky. In those days,
firmware testing was limited to boot testing Windows, and nobody
bothered testing for EFI compliance beyond that (as it is not needed
to get the Windows sticker).

However, the failure mode still strikes me as odd, and I'd be
interested in finding out whether booting with efi=noruntime makes a
difference at all, as that would prevent the SetVariable() call from
taking place, without affecting anything else.

Setting the variable from user space is ultimately a better choice, I
think. The reason it was avoided here is so that we don't have to
rely on user space to set limited permissions on the efivarfs file
entry in order to prevent the seed from being world readable (which is
something, e.g., systemd does today for other 'sensitive' EFI
variables, whatever that means). But given that this variable is in
its own GUIDed namespace, we could easily fix that in efivarfs itself.
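
For illustration, a minimal Python sketch of what the user space side
could look like, assuming the usual efivarfs file layout (a 4-byte
little-endian attribute word followed by the payload). The function
name is hypothetical, and the path is left to the caller rather than
hard-coding a variable name or GUID.

```python
import os
import struct

# Attribute bits as defined by the UEFI specification.
EFI_VARIABLE_NON_VOLATILE = 0x1
EFI_VARIABLE_BOOTSERVICE_ACCESS = 0x2
EFI_VARIABLE_RUNTIME_ACCESS = 0x4

SEED_ATTRS = (EFI_VARIABLE_NON_VOLATILE
              | EFI_VARIABLE_BOOTSERVICE_ACCESS
              | EFI_VARIABLE_RUNTIME_ACCESS)


def write_seed_variable(path, seed, attrs=SEED_ATTRS):
    """Write `seed` to an efivarfs-style file that only its owner can read.

    `path` is supplied by the caller; nothing here assumes a particular
    variable name or GUID.
    """
    # efivarfs file contents: 32-bit attributes, then the variable data.
    payload = struct.pack("<I", attrs) + seed
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        # Tighten the mode even if the file already existed with wider bits.
        os.fchmod(fd, 0o600)
        # efivarfs expects the whole variable in a single write().
        written = os.write(fd, payload)
        if written != len(payload):
            raise OSError("short write while setting EFI variable")
    finally:
        os.close(fd)
```

Usage would be something like write_seed_variable(path, os.urandom(32))
as root. On a real efivarfs mount the file may also carry the immutable
flag, which would have to be cleared first. This is exactly the
permission handling that user space would have to get right unless
efivarfs enforces it itself.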



