Re: kernel 6.2 stuck at boot (efi_call_rts) on arm64

On Sat, Mar 18, 2023 at 11:35:44AM +0100, Ard Biesheuvel wrote:
> On Thu, 16 Mar 2023 at 23:28, Darren Hart <darren@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Thu, Mar 16, 2023 at 07:55:36PM +0100, Ard Biesheuvel wrote:
> > > On Thu, 16 Mar 2023 at 18:52, Andrea Righi <andrea.righi@xxxxxxxxxxxxx> wrote:
> ...
> > > >
> > > > Yay! Success! I just tested your latest efi/urgent (with the fixup) and
> > > > system completed the boot without any soft lockups.
> > > >
> > >
> > > Thanks for confirming. I'll take that as a tested-by
> >
> > The solution in the current branch looks like the best approach we have to date
> > to address the broadest of affected systems. We could switch the eMAG test to an
> > MIDR test I believe (but this won't work for Altra as that would capture all the
> > Neoverse v1 cores beyond Altra). I can look into the MIDR test if you think it's
> > worthwhile - but since I don't think we can eliminate the SMBIOS string test, it
> > doesn't buy us much since we don't need a greedier eMAG test (there aren't more
> > of them to match).
> >
> > Given that some OEM Altra platforms change the processor ID, I don't see a
> > better solution currently than adding their "product name" to the SMBIOS
> > string tests, unfortunately.
> >
> 
> Indeed. I spotted a Gigabyte system [0] with a different processor ID,
> but with a version we can test for.
> 
> So for now, I'll go with
> 
>         socid = (u32 *)record->processor_id;
>         switch (*socid & 0xffff000f) {
>                 static char const altra[] = "Ampere(TM) Altra(TM) Processor";
>                 static char const emag[] = "eMAG";
>         default:
>                 version = efi_get_smbios_string(&record->header, 4,
>                                                 processor_version);
>                 if (!version || (strncmp(version, altra, sizeof(altra) - 1) &&
>                                  strncmp(version, emag, sizeof(emag) - 1)))
>                         break;
> 
>                 fallthrough;
> 
>         case 0x0a160001:        // Altra
>         case 0x0a160002:        // Altra Max
>                 efi_warn("Working around broken SetVirtualAddressMap()\n");
> ...
> 
> which should cover all the affected systems we encountered so far.
> 
> I'll push this to linux-next to let it soak for a little bit, and then
> send it to Linus somewhere during the week
> 
> Thanks,
> Ard.
> 
> 
> [0] https://pastebin.com/HQLE1yYv

Not sure if it's a related issue, but I have found another Ampere box
that boots fine with your fixes, yet the efivarfs.sh kselftest fails
on it with some I/O errors, specifically:

$ sudo ./efivarfs.sh
--------------------
running test_create
--------------------
./efivarfs.sh: line 58: printf: write error: Input/output error
/sys/firmware/efi/efivars/test_create-210be57c-9849-4fc7-a635-e6382d1aec27 has invalid size
  [FAIL]
--------------------
running test_create_empty
--------------------
  [PASS]
--------------------
running test_create_read
--------------------
  [PASS]
--------------------
running test_delete
--------------------
./efivarfs.sh: line 103: printf: write error: Input/output error
  [PASS]
--------------------
running test_zero_size_delete
--------------------
./efivarfs.sh: line 126: printf: write error: Input/output error
./efivarfs.sh: line 134: printf: write error: Input/output error
/sys/firmware/efi/efivars/test_zero_size_delete-210be57c-9849-4fc7-a635-e6382d1aec27 should have been deleted
  [FAIL]
--------------------
running test_open_unlink
--------------------
open(O_WRONLY): Operation not permitted
  [FAIL]
--------------------
running test_valid_filenames
--------------------
./efivarfs.sh: line 158: printf: write error: Input/output error
./efivarfs.sh: line 158: printf: write error: Input/output error
./efivarfs.sh: line 158: printf: write error: Input/output error
./efivarfs.sh: line 158: printf: write error: Input/output error
  [PASS]
--------------------
running test_invalid_filenames
--------------------
  [PASS]
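
To take the shell out of the picture, the failing test_create step can be
reproduced with a single write(2). The following is only a minimal sketch:
efivarfs expects a 4-byte little-endian attribute word followed by the
variable data in one write, and I am assuming (from my reading of
efivarfs.sh) that the test uses attributes 0x7 plus one zero data byte.
The helper names efivar_build() and efivar_write() are made up for this
example.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/*
 * Lay out an efivarfs payload: a 32-bit little-endian attribute word
 * followed by the variable data. Returns the total size, or 0 if the
 * buffer is too small. (Hypothetical helper, not a kernel API.)
 */
static size_t efivar_build(uint8_t *buf, size_t cap, uint32_t attrs,
                           const void *data, size_t len)
{
    if (cap < 4 || len > cap - 4)
        return 0;
    buf[0] = (uint8_t)(attrs & 0xff);
    buf[1] = (uint8_t)((attrs >> 8) & 0xff);
    buf[2] = (uint8_t)((attrs >> 16) & 0xff);
    buf[3] = (uint8_t)((attrs >> 24) & 0xff);
    if (len)
        memcpy(buf + 4, data, len);
    return len + 4;
}

/*
 * Create the variable file and push the whole payload in one write(2).
 * Returns 0 on success, otherwise the errno of the failing call
 * (EIO is also returned for a short write).
 */
static int efivar_write(const char *path, const uint8_t *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    ssize_t n;
    int err;

    if (fd < 0)
        return errno;
    n = write(fd, buf, len);
    if (n < 0)
        err = errno;
    else
        err = ((size_t)n == len) ? 0 : EIO;
    close(fd);
    return err;
}
```

Calling efivar_build() with attributes 0x7 and one zero byte, then
efivar_write() as root on the test_create-210be57c-... path shown in the
log, should return the errno directly (EIO in the failing case), which
separates a firmware failure from a quirk of printf redirection.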

If it helps:

$ sudo hexdump -C /sys/firmware/dmi/entries/4-0/raw
00000000  04 30 04 00 01 03 fe 02  c1 d0 3f 41 00 00 00 00  |.0........?A....|
00000010  03 8a 72 06 b8 0b f0 0a  41 06 05 00 06 00 07 00  |..r.....A.......|
00000020  04 05 06 50 50 50 04 00  01 01 01 00 01 00 01 00  |...PPP..........|
00000030  43 50 55 20 31 00 41 6d  70 65 72 65 28 52 29 00  |CPU 1.Ampere(R).|
00000040  41 6d 70 65 72 65 28 52  29 20 41 6c 74 72 61 28  |Ampere(R) Altra(|
00000050  52 29 20 50 72 6f 63 65  73 73 6f 72 00 30 30 30  |R) Processor.000|
00000060  30 30 30 30 30 30 30 30  30 30 30 30 30 30 32 35  |0000000000000025|
00000070  35 30 32 30 39 30 33 33  38 36 35 42 34 00 30 30  |50209033865B4.00|
00000080  30 30 30 30 30 31 00 51  38 30 2d 33 30 00 00     |000001.Q80-30..|
0000008f
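
For anyone who wants to check this record against the test quoted above,
here is a minimal user-space sketch that decodes it. It is not the kernel
code: smbios_string(), processor_id_lo() and matches_quoted_test() are
names invented for this example, the byte array is copied from the dump
up to the end of the third string, and the final 0x00 is added by hand
because the string table is cut short there.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Type 4 record from the hexdump: formatted area plus first 3 strings. */
static const uint8_t raw_record[] = {
    0x04, 0x30, 0x04, 0x00, 0x01, 0x03, 0xfe, 0x02,
    0xc1, 0xd0, 0x3f, 0x41, 0x00, 0x00, 0x00, 0x00,
    0x03, 0x8a, 0x72, 0x06, 0xb8, 0x0b, 0xf0, 0x0a,
    0x41, 0x06, 0x05, 0x00, 0x06, 0x00, 0x07, 0x00,
    0x04, 0x05, 0x06, 0x50, 0x50, 0x50, 0x04, 0x00,
    0x01, 0x01, 0x01, 0x00, 0x01, 0x00, 0x01, 0x00,
    /* string table starts at offset 0x30 */
    0x43, 0x50, 0x55, 0x20, 0x31, 0x00, 0x41, 0x6d,
    0x70, 0x65, 0x72, 0x65, 0x28, 0x52, 0x29, 0x00,
    0x41, 0x6d, 0x70, 0x65, 0x72, 0x65, 0x28, 0x52,
    0x29, 0x20, 0x41, 0x6c, 0x74, 0x72, 0x61, 0x28,
    0x52, 0x29, 0x20, 0x50, 0x72, 0x6f, 0x63, 0x65,
    0x73, 0x73, 0x6f, 0x72, 0x00,
    0x00 /* added terminator: remaining strings omitted */
};

/* Return the idx-th (1-based) string after the formatted area, or NULL. */
static const char *smbios_string(const uint8_t *rec, size_t len, uint8_t idx)
{
    size_t off;

    if (len < 2 || idx == 0)
        return NULL;
    off = rec[1]; /* byte 1 holds the length of the formatted area */
    while (off < len && rec[off] != 0) {
        const uint8_t *end = memchr(rec + off, 0, len - off);

        if (!end)
            return NULL; /* unterminated string */
        if (--idx == 0)
            return (const char *)(rec + off);
        off = (size_t)(end - rec) + 1;
    }
    return NULL;
}

/* Low 32 bits of the processor ID field (offset 8, little-endian). */
static uint32_t processor_id_lo(const uint8_t *rec)
{
    return (uint32_t)rec[8] | ((uint32_t)rec[9] << 8) |
           ((uint32_t)rec[10] << 16) | ((uint32_t)rec[11] << 24);
}

/* Same decision as the quoted switch: 1 if the record would match. */
static int matches_quoted_test(const uint8_t *rec, size_t len)
{
    static const char altra[] = "Ampere(TM) Altra(TM) Processor";
    static const char emag[] = "eMAG";
    const char *version;

    if (len < 0x11)
        return 0;
    switch (processor_id_lo(rec) & 0xffff000f) {
    case 0x0a160001: /* Altra */
    case 0x0a160002: /* Altra Max */
        return 1;
    default:
        /* offset 0x10 is the processor version string index */
        version = smbios_string(rec, len, rec[0x10]);
        return version &&
               (!strncmp(version, altra, sizeof(altra) - 1) ||
                !strncmp(version, emag, sizeof(emag) - 1));
    }
}
```

On my reading of the dump, the low word of the processor ID is 0x413fd0c1
and the version string is spelled with "(R)" where the quoted test
compares against "(TM)", so matches_quoted_test(raw_record,
sizeof(raw_record)) is worth running to see whether this box is covered
by the workaround at all.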

I guess the EFI variable services are not very reliable on this box...

-Andrea