On 7/21/22 08:14, Borislav Petkov wrote:
> On Tue, Jun 14, 2022 at 03:02:19PM +0300, Kirill A. Shutemov wrote:
>> On-demand memory accept means latency spikes every time kernel steps
>> onto a new memory block. The spikes will go away once workload data
>> set size gets stabilized or all memory gets accepted.
>
> What does that mean?
>
> If we're accepting 2M pages and considering referential locality, how
> are those "spikes" even noticeable?

Acceptance is slow and the heavy lifting is done inside the TDX module.
It involves flushing old aliases out of the caches and initializing the
memory integrity metadata for every cacheline.

This implementation does acceptance in 2MB chunks while holding a
global lock.  So, those (effective) 2MB clflush+memset's (plus a few
thousand cycles for the hypercall/tdcall transitions) can't happen in
parallel.  They are serialized and must wait on each other.

If you have a few hundred CPUs all trying to allocate memory (say,
doing the first kernel compile after a reboot), this is going to be
very, very painful for a while.

That said, I think this is the right place to _start_.  There is going
to need to be some kind of follow-on solution (likely background
acceptance of some kind).  But, even with that solution, *this* code is
still needed to handle the degenerate case where the background
accepter can't keep up with foreground memory needs.