On Fri, Apr 29, 2022, Dave Hansen wrote:
> On 4/29/22 00:46, Kai Huang wrote:
> > On Thu, 2022-04-28 at 10:12 -0700, Dave Hansen wrote:
> >> This is also a good place to note the downsides of using
> >> alloc_contig_pages().
> >
> > For instance:
> >
> > The allocation may fail when memory usage is under pressure.
>
> It's not really memory pressure, though. The larger the allocation, the
> more likely it is to fail. The more likely it is that the kernel can't
> free the memory or that if you need 1GB of contiguous memory that
> 999.996MB gets freed, but there is one stubborn page left.
>
> alloc_contig_pages() can and will fail. The only mitigation which is
> guaranteed to avoid this is doing the allocation at boot. But, you're
> not doing that to avoid wasting memory on every TDX system that doesn't
> use TDX.
>
> A *good* way (although not foolproof) is to launch a TDX VM early in
> boot before memory gets fragmented or consumed. You might even want to
> recommend this in the documentation.

What about providing a kernel param to tell the kernel to do the allocation
during boot? Or maybe a sysfs knob to reserve/free the memory, a la
nr_overcommit_hugepages?

I suspect that most/all deployments that actually want to use TDX would much
prefer to eat the overhead if TDX VMs are never scheduled on the host, as
opposed to having to deal with a host in a TDX pool not actually being able to
run TDX VMs.