On 8/10/21 10:51 AM, Kirill A. Shutemov wrote:
> On Tue, Aug 10, 2021 at 10:36:21AM -0700, Dave Hansen wrote:
>>> The difference is going to be substantially lower once we get it optimized
>>> properly.
>> What does this mean? Is this future work in the kernel or somewhere in
>> the TDX hardware/firmware which will speed things up?
> Kernel has to be changed to accept memory in 2M and 1G chunks where
> possible. The interface exists and described in spec, but not yet used in
> guest kernel.

From a quick scan of the spec, I only see:

> 7.9.3. Page Acceptance by the Guest TD: TDG.MEM.PAGE.ACCEPT ... The guest
> TD can accept a dynamically added 4KB page using TDG.MEM.PAGE.ACCEPT
> with the page GPA as an input.

Is there some other 2M/1G page-acceptance call that I'm missing?

> It would cut hypercall overhead dramatically. It makes upfront memory
> accept more bearable and lowers latency of lazy memory accept. So I expect
> the gap being not 20x, but like 3-5x (which is still huge).

It would be nice to be able to judge the benefits of this series based
on the final form. I guess we'll take what we can get, though.

Either way, I'd still like to see some *actual* numbers for at least one
configuration:

	With this series applied, userspace starts to run at X seconds
	after kernel boot. Without this series, userspace runs at Y
	seconds.
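
For a rough sense of why accept granularity matters for call overhead, here is a back-of-the-envelope sketch. It assumes one accept call per page, fully aligned memory, and a hypothetical 16 GiB guest. None of those assumptions come from the spec or from this series. It counts calls only and says nothing about wall-clock time, so it is no substitute for the measured X and Y numbers asked for above.

```python
# Back-of-the-envelope count of page-accept calls.
# Assumptions (not from the spec or the patch series): one call per page,
# and memory that is perfectly aligned to each page size.
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

PAGE_SIZES = {"4K": 4 * KIB, "2M": 2 * MIB, "1G": GIB}


def accept_calls(mem_bytes, page_bytes):
    """Number of accept calls needed to cover mem_bytes, rounding up."""
    return -(-mem_bytes // page_bytes)


def summarize(mem_bytes):
    """Map each page-size label to the call count for mem_bytes."""
    return {name: accept_calls(mem_bytes, size)
            for name, size in PAGE_SIZES.items()}


if __name__ == "__main__":
    guest_mem = 16 * GIB  # hypothetical guest size, chosen for illustration
    for name, calls in summarize(guest_mem).items():
        print(f"{name}: {calls} calls")
```

In practice a guest's memory will not all be 2M- or 1G-aligned, so a real count would mix page sizes and land somewhere between these figures.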