Kai Huang wrote:
> TDX reports a list of "Convertible Memory Region" (CMR) to indicate all
> memory regions that can possibly be used by the TDX module, but they are
> not automatically usable to the TDX module. As a step of initializing
> the TDX module, the kernel needs to choose a list of memory regions (out
> from convertible memory regions) that the TDX module can use and pass
> those regions to the TDX module. Once this is done, those "TDX-usable"
> memory regions are fixed during module's lifetime. No more TDX-usable
> memory can be added to the TDX module after that.
>
> The initial support of TDX guests will only allocate TDX guest memory
> from the global page allocator. To keep things simple, this initial
> implementation simply guarantees all pages in the page allocator are TDX
> memory. To achieve this, use all system memory in the core-mm at the
> time of initializing the TDX module as TDX memory, and at the meantime,
> refuse to add any non-TDX-memory in the memory hotplug.
>
> Specifically, walk through all memory regions managed by memblock and
> add them to a global list of "TDX-usable" memory regions, which is a
> fixed list after the module initialization (or empty if initialization
> fails). To reject non-TDX-memory in memory hotplug, add an additional
> check in arch_add_memory() to check whether the new region is covered by
> any region in the "TDX-usable" memory region list.
>
> Note this requires all memory regions in memblock are TDX convertible
> memory when initializing the TDX module. This is true in practice if no
> new memory has been hot-added before initializing the TDX module, since
> in practice all boot-time present DIMM is TDX convertible memory. If
> any new memory has been hot-added, then initializing the TDX module will
> fail due to that memory region is not covered by CMR.
>
> This can be enhanced in the future, i.e. by allowing adding non-TDX
> memory to a separate NUMA node. In this case, the "TDX-capable" nodes
> and the "non-TDX-capable" nodes can co-exist, but the kernel/userspace
> needs to guarantee memory pages for TDX guests are always allocated from
> the "TDX-capable" nodes.
>
> Note TDX assumes convertible memory is always physically present during
> machine's runtime. A non-buggy BIOS should never support hot-removal of
> any convertible memory. This implementation doesn't handle ACPI memory
> removal but depends on the BIOS to behave correctly.
>
> Signed-off-by: Kai Huang <kai.huang@xxxxxxxxx>
> ---
>
> v6 -> v7:
>  - Changed to use all system memory in memblock at the time of
>    initializing the TDX module as TDX memory
>  - Added memory hotplug support
>
> ---
>  arch/x86/Kconfig            |   1 +
>  arch/x86/include/asm/tdx.h  |   3 +
>  arch/x86/mm/init_64.c       |  10 ++
>  arch/x86/virt/vmx/tdx/tdx.c | 183 ++++++++++++++++++++++++++++++++++++
>  4 files changed, 197 insertions(+)
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index dd333b46fafb..b36129183035 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -1959,6 +1959,7 @@ config INTEL_TDX_HOST
>  	depends on X86_64
>  	depends on KVM_INTEL
>  	depends on X86_X2APIC
> +	select ARCH_KEEP_MEMBLOCK
>  	help
>  	  Intel Trust Domain Extensions (TDX) protects guest VMs from malicious
>  	  host and certain physical attacks. This option enables necessary TDX
> diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
> index d688228f3151..71169ecefabf 100644
> --- a/arch/x86/include/asm/tdx.h
> +++ b/arch/x86/include/asm/tdx.h
> @@ -111,9 +111,12 @@ static inline long tdx_kvm_hypercall(unsigned int nr, unsigned long p1,
>  #ifdef CONFIG_INTEL_TDX_HOST
>  bool platform_tdx_enabled(void);
>  int tdx_enable(void);
> +bool tdx_cc_memory_compatible(unsigned long start_pfn, unsigned long end_pfn);
>  #else	/* !CONFIG_INTEL_TDX_HOST */
>  static inline bool platform_tdx_enabled(void) { return false; }
>  static inline int tdx_enable(void) { return -ENODEV; }
> +static inline bool tdx_cc_memory_compatible(unsigned long start_pfn,
> +		unsigned long end_pfn) { return true; }
>  #endif	/* CONFIG_INTEL_TDX_HOST */
>
>  #endif /* !__ASSEMBLY__ */
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 3f040c6e5d13..900341333d7e 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -55,6 +55,7 @@
>  #include <asm/uv/uv.h>
>  #include <asm/setup.h>
>  #include <asm/ftrace.h>
> +#include <asm/tdx.h>
>
>  #include "mm_internal.h"
>
> @@ -968,6 +969,15 @@ int arch_add_memory(int nid, u64 start, u64 size,
>  	unsigned long start_pfn = start >> PAGE_SHIFT;
>  	unsigned long nr_pages = size >> PAGE_SHIFT;
>
> +	/*
> +	 * For now if TDX is enabled, all pages in the page allocator
> +	 * must be TDX memory, which is a fixed set of memory regions
> +	 * that are passed to the TDX module. Reject the new region
> +	 * if it is not TDX memory to guarantee above is true.
> +	 */
> +	if (!tdx_cc_memory_compatible(start_pfn, start_pfn + nr_pages))
> +		return -EINVAL;

arch_add_memory() does not add memory to the page allocator. For
example, memremap_pages() uses arch_add_memory() and explicitly does not
release the memory to the page allocator. This check belongs in
add_memory_resource() to prevent new memory that violates TDX from being
onlined.

Hopefully there is also an option to disable TDX from the kernel boot
command line to recover memory-hotplug without needing to boot into the
BIOS to toggle TDX.
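
For reference, the coverage rule described in the changelog ("whether
the new region is covered by any region in the TDX-usable list") can be
sketched in plain userspace C. This is only an illustration, not the
tdx.c hunk, which is not quoted above. struct tdx_region_sketch and
range_is_tdx_compatible() are hypothetical names, a flat array stands in
for the kernel list, and treating an empty list as "accept everything"
is an assumption drawn from the changelog's note that the list is empty
if initialization fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one "TDX-usable" entry: [start_pfn, end_pfn). */
struct tdx_region_sketch {
	unsigned long start_pfn;
	unsigned long end_pfn;	/* exclusive */
};

/*
 * Return true if the PFN range [start_pfn, end_pfn) may be hot-added.
 *
 * Assumption: an empty list means the TDX module was never initialized
 * successfully, so no range is rejected in that case.
 */
static bool range_is_tdx_compatible(const struct tdx_region_sketch *regions,
				    size_t nr_regions,
				    unsigned long start_pfn,
				    unsigned long end_pfn)
{
	size_t i;

	/* Callers are expected to pass a non-empty range. */
	assert(start_pfn < end_pfn);

	if (nr_regions == 0)
		return true;

	for (i = 0; i < nr_regions; i++) {
		/* The range must sit entirely inside a single region. */
		if (start_pfn >= regions[i].start_pfn &&
		    end_pfn <= regions[i].end_pfn)
			return true;
	}

	/* Partially covered or uncovered ranges are not TDX memory. */
	return false;
}
```

The predicate itself would look the same whether it is called from
arch_add_memory() or from add_memory_resource(); only the call site
decides whether memremap_pages() users are affected by it.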