On Tue, May 16, 2023 at 05:41:55PM -0500, Tom Lendacky wrote:
> On 5/13/23 17:04, Kirill A. Shutemov wrote:
> > UEFI Specification version 2.9 introduces the concept of memory
> > acceptance: some Virtual Machine platforms, such as Intel TDX or AMD
> > SEV-SNP, require memory to be accepted before it can be used by the
> > guest. Accepting happens via a protocol specific for the Virtual
> > Machine platform.
> >
> > Accepting memory is costly and it makes VMM allocate memory for the
> > accepted guest physical address range. It's better to postpone memory
> > acceptance until memory is needed. It lowers boot time and reduces
> > memory overhead.
> >
> > The kernel needs to know what memory has been accepted. Firmware
> > communicates this information via memory map: a new memory type --
> > EFI_UNACCEPTED_MEMORY -- indicates such memory.
> >
> > Range-based tracking works fine for firmware, but it gets bulky for
> > the kernel: e820 has to be modified on every page acceptance. It leads
> > to table fragmentation, but there's a limited number of entries in the
> > e820 table.
> >
> > Another option is to mark such memory as usable in e820 and track if the
> > range has been accepted in a bitmap. One bit in the bitmap represents
> > 2MiB in the address space: one 4k page is enough to track 64GiB of
> > physical address space.
> >
> > In the worst-case scenario -- a huge hole in the middle of the
> > address space -- it needs 256MiB to handle 4PiB of the address
> > space.
> >
> > Any unaccepted memory that is not aligned to 2M gets accepted upfront.
> >
> > The approach lowers boot time substantially. Boot to shell is ~2.5x
> > faster for 4G TDX VM and ~4x faster for 64G.
> >
> > TDX-specific code is isolated from the core of unaccepted memory support.
> > It is supposed to help plug in different implementations of unaccepted
> > memory such as SEV-SNP.
> >
> > -- Fragmentation study --
> >
> > Vlastimil and Mel were concerned about the effect of unaccepted memory on
> > fragmentation prevention measures in the page allocator. I tried to
> > evaluate it, but it is tricky. As suggested I tried to run multiple
> > parallel kernel builds and follow how often kmem:mm_page_alloc_extfrag
> > gets hit.
> >
> > See results in the v9 of the patchset[1][2]
> >
> > [1] https://lore.kernel.org/all/20230330114956.20342-1-kirill.shutemov@xxxxxxxxxxxxxxx
> > [2] https://lore.kernel.org/all/20230416191940.ex7ao43pmrjhru2p@xxxxxxxxxxxxxxxxx
> >
> > --
> >
> > The tree can be found here:
> >
> > https://github.com/intel/tdx.git guest-unaccepted-memory
>
> I get some failures when building without TDX support selected in my
> kernel config after adding unaccepted memory support for SNP:
>
> In file included from arch/x86/boot/compressed/../../coco/tdx/tdx-shared.c:1,
>                  from arch/x86/boot/compressed/tdx-shared.c:2:
> ./arch/x86/include/asm/tdx.h: In function ‘tdx_kvm_hypercall’:
> ./arch/x86/include/asm/tdx.h:72:17: error: ‘ENODEV’ undeclared (first use in this function)
>    72 |         return -ENODEV;
>       |                 ^~~~~~
> ./arch/x86/include/asm/tdx.h:72:17: note: each undeclared identifier is reported only once for each function it appears in
>
> Adding an include for linux/errno.h gets past that error, but then
> I get the following:
>
> ld: arch/x86/boot/compressed/tdx-shared.o: in function `tdx_enc_status_changed_phys':
> tdx-shared.c:(.text+0x42): undefined reference to `__tdx_hypercall'
> ld: tdx-shared.c:(.text+0x7f): undefined reference to `__tdx_module_call'
> ld: tdx-shared.c:(.text+0xce): undefined reference to `__tdx_module_call'
> ld: tdx-shared.c:(.text+0x13b): undefined reference to `__tdx_module_call'
> ld: tdx-shared.c:(.text+0x153): undefined reference to `cc_mkdec'
> ld: tdx-shared.c:(.text+0x15d): undefined reference to `cc_mkdec'
> ld: tdx-shared.c:(.text+0x18e): undefined reference to `__tdx_hypercall'
> ld: arch/x86/boot/compressed/vmlinux: hidden symbol `__tdx_hypercall' isn't defined
> ld: final link failed: bad value
>
> So it looks like arch/x86/boot/compressed/tdx-shared.c is being
> built, while arch/x86/boot/compressed/tdx.c isn't.

Right. I think this should help:

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 78f67e0a2666..b13a58021086 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -106,8 +106,8 @@ ifdef CONFIG_X86_64
 endif
 
 vmlinux-objs-$(CONFIG_ACPI) += $(obj)/acpi.o
-vmlinux-objs-$(CONFIG_INTEL_TDX_GUEST) += $(obj)/tdx.o $(obj)/tdcall.o
-vmlinux-objs-$(CONFIG_UNACCEPTED_MEMORY) += $(obj)/mem.o $(obj)/tdx-shared.o
+vmlinux-objs-$(CONFIG_INTEL_TDX_GUEST) += $(obj)/tdx.o $(obj)/tdcall.o $(obj)/tdx-shared.o
+vmlinux-objs-$(CONFIG_UNACCEPTED_MEMORY) += $(obj)/mem.o
 
 vmlinux-objs-$(CONFIG_EFI) += $(obj)/efi.o
 vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_mixed.o

> After setting TDX in the kernel config, I can build successfully, but
> I'm running into an error when trying to accept memory during
> decompression.
>
> In drivers/firmware/efi/libstub/unaccepted_memory.c, I can see that the
> unaccepted_table is allocated, but when accept_memory() is invoked the
> table address is now zero. I thought maybe it had to do with bss, but even
> putting it in the .data section didn't help. I'll keep digging, but if you
> have any ideas, that would be great.

Not right away. But maybe seeing your side of enabling would help.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov
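
For reference, the bitmap sizing figures in the quoted cover letter (one bit
per 2MiB, one 4k page covering 64GiB, 256MiB for 4PiB) can be checked with a
small standalone C sketch. The names UNIT_SIZE, bitmap_bytes() and
bitmap_coverage() are hypothetical and are not taken from the patchset; this
only illustrates the arithmetic, not how the kernel code is written.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption from the cover letter: one bitmap bit tracks one 2MiB unit. */
#define UNIT_SIZE (UINT64_C(2) << 20)

/* Hypothetical helper: bitmap bytes needed to cover `span` bytes of address space. */
static uint64_t bitmap_bytes(uint64_t span)
{
	/* Guard against overflow in the round-up below. */
	assert(span <= UINT64_MAX - (UNIT_SIZE - 1));

	uint64_t bits = (span + UNIT_SIZE - 1) / UNIT_SIZE; /* round up to whole 2MiB units */

	return (bits + 7) / 8; /* round up to whole bytes */
}

/* Hypothetical helper: address space covered by `bytes` bytes of bitmap. */
static uint64_t bitmap_coverage(uint64_t bytes)
{
	/* Guard against overflow in the multiplication below. */
	assert(bytes <= UINT64_MAX / (8 * UNIT_SIZE));

	return bytes * 8 * UNIT_SIZE;
}
```

With these helpers, bitmap_coverage(4096) evaluates to 64GiB and
bitmap_bytes() for a 4PiB span evaluates to 256MiB, which matches the numbers
in the cover letter.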