Re: [PATCH] efi/unaccepted: Use ACPI reclaim memory for unaccepted memory table

On 8/16/23 14:05, Ard Biesheuvel wrote:
Kirill reports that crashkernels fail to work on confidential VMs that
rely on the unaccepted memory table, and this appears to be caused by
the fact that it is not considered part of the set of firmware tables
that the crashkernel needs to map.

This is an oversight, and a result of the use of the EFI_LOADER_DATA
memory type for this table. The correct memory type to use for any
firmware table is EFI_ACPI_RECLAIM_MEMORY (including ones created by the
EFI stub), even though the name suggests that it is specific to ACPI.
ACPI reclaim means that the memory is used by the firmware to expose
information to the operating system, but that the memory region has no
special significance to the firmware itself, and the OS is free to
reclaim the memory and use it as ordinary memory if it is not interested
in the contents, or if it has already consumed them. In Linux, this
memory is never reclaimed, but it is always covered by the kernel direct
map and generally made accessible as ordinary memory.

On x86, ACPI reclaim memory is translated into E820_ACPI, which the
kexec logic already recognizes as memory that the crashkernel may need
to access, and so it will be mapped and accessible to the booting
crash kernel.
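
For illustration only (not part of the patch): a minimal sketch of why
the allocation type matters. The enum values follow the UEFI
specification and the classic E820 type codes; the function names
efi_to_e820(), crashkernel_maps_as_firmware_table() and
check_patch_effect() are hypothetical and heavily simplified compared to
the real translation and kexec code.

```c
#include <assert.h>
#include <stdbool.h>

/* EFI memory type values as defined by the UEFI specification. */
enum efi_memory_type {
	EFI_LOADER_DATA         = 2,
	EFI_CONVENTIONAL_MEMORY = 7,
	EFI_ACPI_RECLAIM_MEMORY = 9,
};

/* Classic E820 type codes. */
enum e820_type {
	E820_RAM      = 1,
	E820_RESERVED = 2,
	E820_ACPI     = 3,
};

/* Hypothetical, simplified stand-in for the EFI-to-E820 translation. */
static enum e820_type efi_to_e820(enum efi_memory_type type)
{
	switch (type) {
	case EFI_ACPI_RECLAIM_MEMORY:
		return E820_ACPI;	/* firmware tables the OS may reclaim */
	case EFI_LOADER_DATA:
	case EFI_CONVENTIONAL_MEMORY:
		return E820_RAM;	/* indistinguishable from ordinary RAM */
	}
	return E820_RESERVED;		/* everything this sketch does not model */
}

/*
 * Hypothetical predicate: would the crash kernel treat a region of this
 * type as firmware table memory that it needs to map?
 */
static bool crashkernel_maps_as_firmware_table(enum e820_type type)
{
	return type == E820_ACPI;
}

/* Before the patch the table looks like plain RAM; after it, like E820_ACPI. */
static void check_patch_effect(void)
{
	assert(!crashkernel_maps_as_firmware_table(efi_to_e820(EFI_LOADER_DATA)));
	assert(crashkernel_maps_as_firmware_table(efi_to_e820(EFI_ACPI_RECLAIM_MEMORY)));
}
```

Under these assumptions, changing only the memory type passed to
allocate_pool is enough to move the table into the set of regions that
the crash kernel maps.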

Reported-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
Signed-off-by: Ard Biesheuvel <ardb@xxxxxxxxxx>
---
  drivers/firmware/efi/libstub/unaccepted_memory.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/firmware/efi/libstub/unaccepted_memory.c b/drivers/firmware/efi/libstub/unaccepted_memory.c
index ca61f4733ea58693..9a655f30ba47db01 100644
--- a/drivers/firmware/efi/libstub/unaccepted_memory.c
+++ b/drivers/firmware/efi/libstub/unaccepted_memory.c
@@ -62,7 +62,7 @@ efi_status_t allocate_unaccepted_bitmap(__u32 nr_desc,
  	bitmap_size = DIV_ROUND_UP(unaccepted_end - unaccepted_start,
  				   EFI_UNACCEPTED_UNIT_SIZE * BITS_PER_BYTE);
-	status = efi_bs_call(allocate_pool, EFI_LOADER_DATA,
+	status = efi_bs_call(allocate_pool, EFI_ACPI_RECLAIM_MEMORY,

I bisected an SNP guest crash when using the tip tree to this commit. When
the kernel switches over to the swapper_pg_dir in init_mem_mapping(), the
unaccepted table is no longer mapped. Here's a copy of the stack trace:

[    0.074233] *** DEBUG: accept_memory:36 - unaccepted=0xffff88807f77ef18
[    0.075805] BUG: unable to handle page fault for address: ffff88807f77ef1c
[    0.076541] #PF: supervisor read access in kernel mode
[    0.077089] #PF: error_code(0x0000) - not-present page
[    0.077631] PGD 8000004c01067 P4D 8000004c01067 PUD 800017fffe067 PMD 8000004c04067 PTE 0
[    0.078498] Oops: 0000 [#1] PREEMPT SMP NOPTI
[    0.078967] CPU: 0 PID: 0 Comm: swapper Not tainted 6.6.0-rc2-sos-sev #12
[    0.079682] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 2/2/2022
[    0.080507] RIP: 0010:accept_memory+0x75/0x250
[    0.080984] Code: 74 4a 4c 89 e1 ba 24 00 00 00 48 c7 c6 68 9e 10 82 48 c7 c7 50 9e 53 82 e8 48 92 79 ff 4c 89 e7 e8 50 b5 6e ff ba 2c 00 00 00 <45> 8b 74 24 04 48 c7 c6 68 9e 10 82 48 c7 c7 3d 9e 53 82 e8 23 92
[    0.082969] RSP: 0000:ffffffff82803df8 EFLAGS: 00010046
[    0.083518] RAX: 0000000000000000 RBX: 000000017fffd000 RCX: 0000000000000000
[    0.084271] RDX: 000000000000002c RSI: ffffffff82803cc0 RDI: 00000000ffffffff
[    0.085029] RBP: 0000000000000000 R08: 00000000ffff7fff R09: 0000000000000001
[    0.085786] R10: 00000000ffff7fff R11: ffffffff8286f9e0 R12: ffff88807f77ef18
[    0.086537] R13: 000000017ffd2000 R14: 0000000000000000 R15: 000000017ffd2000
[    0.087293] FS:  0000000000000000(0000) GS:ffffffff83273000(0000) knlGS:0000000000000000
[    0.088154] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.088764] CR2: ffff88807f77ef1c CR3: 000800000382e000 CR4: 00000000000000f0
[    0.089520] Call Trace:
[    0.089783]  <TASK>
[    0.090005]  ? __die+0x1f/0x70
[    0.090334]  ? page_fault_oops+0x81/0x150
[    0.090780]  ? kernelmode_fixup_or_oops+0x84/0x110
[    0.091287]  ? exc_page_fault+0xa8/0x150
[    0.091705]  ? asm_exc_page_fault+0x22/0x30
[    0.092161]  ? accept_memory+0x75/0x250
[    0.092564]  ? accept_memory+0x70/0x250
[    0.092971]  ? memblock_alloc_range_nid+0xf4/0x160
[    0.093479]  ? numa_register_memblks.constprop.0+0x286/0x3a0
[    0.094079]  ? __pfx_dummy_numa_init+0x10/0x10
[    0.094548]  ? numa_init+0x102/0x2a0
[    0.094929]  ? setup_arch+0xc58/0x1010
[    0.095326]  ? start_kernel+0x5e/0x5e0
[    0.095728]  ? x86_64_start_reservations+0x14/0x30
[    0.096235]  ? x86_64_start_kernel+0x79/0x80
[    0.096688]  ? secondary_startup_64_no_verify+0x16b/0x16b
[    0.097269]  </TASK>
[    0.097499] Modules linked in:
[    0.097826] CR2: ffff88807f77ef1c
[    0.098178] ---[ end trace 0000000000000000 ]---
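
For reference, the faulting address can be decoded from the trace with a
little arithmetic. The sketch below makes two assumptions that are not
stated in the trace itself: that the direct map starts at the default
4-level base 0xffff888000000000 (no randomization applied at this point),
and that the table header is laid out as version, unit_size, phys_base,
size. The struct and helper names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed default base of the x86-64 direct map with 4-level paging. */
#define DIRECT_MAP_BASE 0xffff888000000000ULL

/* Assumed layout of the unaccepted memory table header. */
struct unaccepted_table_hdr {
	uint32_t version;
	uint32_t unit_size;
	uint64_t phys_base;
	uint64_t size;
};

/* unit_size is the second 32-bit field, so it sits at offset 4. */
static_assert(offsetof(struct unaccepted_table_hdr, unit_size) == 4,
	      "unit_size expected at offset 4");

/* Convert a direct-map virtual address to its physical address. */
static uint64_t direct_map_to_phys(uint64_t vaddr)
{
	return vaddr - DIRECT_MAP_BASE;
}

/* Byte offset of the faulting access relative to the start of the table. */
static uint64_t fault_offset(uint64_t cr2, uint64_t table_vaddr)
{
	return cr2 - table_vaddr;
}
```

With the values from the trace, the table pointer 0xffff88807f77ef18
(also in R12) corresponds to physical address 0x7f77ef18, and CR2
(0xffff88807f77ef1c) is 4 bytes into the table. That matches the
faulting instruction bytes "45 8b 74 24 04", a 32-bit load from
[r12 + 4], which under the assumed layout is the first read of
unit_size through the direct map.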

Thanks,
Tom

  			     sizeof(*unaccepted_table) + bitmap_size,
  			     (void **)&unaccepted_table);
  	if (status != EFI_SUCCESS) {


