On 7/22/22 12:18, Borislav Petkov wrote:
> On Thu, Jul 21, 2022 at 08:49:31AM -0700, Dave Hansen wrote:
>> So, those (effective) 2MB clflush+memset's (plus a few thousand cycles
>> for the hypercall/tdcall transitions)
>
> So this sounds strange - page validation on AMD - judging by the
> pseudocode of the PVALIDATE insn - does a bunch of sanity checks on the
> gVA of the page and then installs it into the RMP and also "PVALIDATE
> performs the same segmentation and paging checks as a 1-byte read.
> PVALIDATE does not invalidate TLB caches."
>
> But that still sounds a lot less work than what the TDX module needs to
> do...

Sure does... *Something* has to manage the cache coherency so that old
physical aliases of the converted memory don't write back and clobber
new data. But, maybe the hardware is doing that now.

>> If you have a few hundred CPUs all trying to allocate memory (say,
>> doing the first kernel compile after a reboot), this is going to be
>> very, very painful for a while.
>>
>> That said, I think this is the right place to _start_. There is going
>> to need to be some kind of follow-on solution (likely background
>> acceptance of some kind). But, even with that solution, *this* code
>> is still needed to handle the degenerate case where the background
>> accepter can't keep up with foreground memory needs.
>
> I'm still catering to the view that it should be a two-tier thing: you
> validate during boot a certain amount - say 4G - a size for which the
> boot delay is acceptable and you do the rest on-demand along with a
> background accepter.
>
> That should give you the best of both worlds...

Yeah, that two-tier system is the way it's happening today from what I
understand. This whole conversation is about how to handle the >4GB
memory.