Re: [PATCH 3/3] efi/memattr: Include EFI memmap size in corruption warnings

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 7 Jan 2025 at 13:05, Breno Leitao <leitao@xxxxxxxxxx> wrote:
>
> Hello Ard,
>
> On Tue, Jan 07, 2025 at 12:24:03PM +0100, Ard Biesheuvel wrote:
> > On Mon, 6 Jan 2025 at 20:03, Breno Leitao <leitao@xxxxxxxxxx> wrote:
> > >
> > > Add EFI memory map size to warning messages when a corrupted Memory
> > > Attributes Table is detected, making it easier to diagnose firmware issues.
> > >
> > > Signed-off-by: Breno Leitao <leitao@xxxxxxxxxx>
> > > ---
> > >  drivers/firmware/efi/memattr.c | 9 +++++----
> > >  1 file changed, 5 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/drivers/firmware/efi/memattr.c b/drivers/firmware/efi/memattr.c
> > > index 5f83cdea88b05cb325e9f90c14a0048131e53cfa..2c276bcc0df48352bec6cd96b69edf67a16f6069 100644
> > > --- a/drivers/firmware/efi/memattr.c
> > > +++ b/drivers/firmware/efi/memattr.c
> > > @@ -22,7 +22,7 @@ unsigned long __ro_after_init efi_mem_attr_table = EFI_INVALID_TABLE_ADDR;
> > >  void __init efi_memattr_init(void)
> > >  {
> > >         efi_memory_attributes_table_t *tbl;
> > > -       unsigned long size;
> > > +       unsigned long size, efi_map_size;
> > >
> > >         if (efi_mem_attr_table == EFI_INVALID_TABLE_ADDR)
> > >                 return;
> > > @@ -49,9 +49,10 @@ void __init efi_memattr_init(void)
> > >          * just be ignored altogether.
> > >          */
> > >         size = tbl->num_entries * tbl->desc_size;
> > > -       if (size > 3 * efi.memmap.nr_map * efi.memmap.desc_size) {
> > > -               pr_warn(FW_BUG "Corrupted EFI Memory Attributes Table detected! (version == %u, desc_size == %u, num_entries == %u)\n",
> > > -                       tbl->version, tbl->desc_size, tbl->num_entries);
> > > +       efi_map_size = efi.memmap.nr_map * efi.memmap.desc_size;
> > > +       if (size > 3 * efi_map_size) {
> > > +               pr_warn(FW_BUG "Corrupted EFI Memory Attributes Table detected! (version == %u, desc_size == %u, num_entries == %u, efi_map_size == %lu)\n",
> > > +                       tbl->version, tbl->desc_size, tbl->num_entries, efi_map_size);
> > >                 goto unmap;
> > >         }
> > >
> > >
> >
> > Hello Breno,
> >
> > I don't mind the patch per se, but I don't think it is terribly useful either.
> >
> > Could you explain how this helps?
>
> We are seeing a bunch of `Corrupted EFI Memory Attributes Table
> detected!` in the Meta fleet, and this is something we are
> investigating.
>
> We highly think this is related to some kexec overwrites, and when we
> get here, the EFI table is completely garbage. I haven't seen this
> problem on cold boot.
>

It likely means the memory is not reserved correctly.

Could you check whether this

--- a/drivers/firmware/efi/memattr.c
+++ b/drivers/firmware/efi/memattr.c
@@ -56,7 +56,7 @@ int __init efi_memattr_init(void)
        }

        tbl_size = sizeof(*tbl) + size;
-       memblock_reserve(efi_mem_attr_table, tbl_size);
+       efi_mem_reserve(efi_mem_attr_table, tbl_size);
        set_bit(EFI_MEM_ATTR, &efi.flags);

 unmap:


makes any difference?


> Here are sof the instances I see:
>
>         efi: memattr: [Firmware Bug]: Corrupted EFI Memory Attributes Table detected! (version == 0, desc_size == 18058, num_entries == 33554432)
>         efi: memattr: [Firmware Bug]: Corrupted EFI Memory Attributes Table detected! (version == 1, desc_size == 2072184435, num_entries == 3248688968)
>         efi: memattr: [Firmware Bug]: Corrupted EFI Memory Attributes Table detected! (version == 0, desc_size == 83886080, num_entries == 304)
>         efi: memattr: [Firmware Bug]: Corrupted EFI Memory Attributes Table detected! (version == 2, desc_size == 48, num_entries == 40)
>

The last one looks like a false positive: each of those values seems
perfectly reasonable.

Any chance you could dump the memory map and this table (boot using
efi=debug) on this system?


> Anyway, back to you question, this patch helped us to narrow down and
> find where the problem was, by printing all variables taken in
> consideration to get the conclusion that the firmware is buggy.
>

Fair enough.

> Regarding the problem, Usama and I are suspecting that it might be
> related to some 77d48d39e99170 ("efistub/tpm: Use ACPI reclaim memory for
> event log to avoid corruption"), but at this time with memattr table, where it
> might not preserved during kexec(?).
>

Please see the suggestion above.




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux