On Thu, Jan 19, 2023 at 4:16 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Thu, Jan 19, 2023, Huang, Kai wrote:
> > On Thu, 2023-01-19 at 21:36 +0000, Sean Christopherson wrote:
> > > The least invasive idea I have is expand the TDP MMU's concept of "frozen" SPTEs
> > > and freeze (a.k.a. lock) the SPTE (KVM's mirror) until the corresponding S-EPT
> > > update completes.
> >
> > This will introduce another "having-to-wait while SPTE is frozen" problem I
> > think, which IIUC means (one way is) you have to do some loop and retry, perhaps
> > similar to yield_safe.
>
> Yes, but because the TDP MMU already freezes SPTEs (just for a shorter duration),
> I'm 99% sure all of the affected flows already know how to yield/bail when necessary.
>
> The problem with the zero-step mitigation is that it could (theoretically) cause
> a "busy" error on literally any accesses, which makes it infeasible for KVM to have
> sane behavior. E.g. freezing SPTEs to avoid the ordering issues isn't necessary
> when holding mmu_lock for write, whereas the zero-step madness brings everything
> into play.

(I'm still ramping up on TDX so apologies in advance if the following
is totally off base.)

The complexity, and to a lesser extent the memory overhead, of
mirroring Secure EPT tables with the TDP MMU makes me wonder if it is
really worth it. Especially since the initial TDX support has so many
constraints that would seem to allow a simpler implementation: all
private memory is pinned, no live migration support, no test/clear
young notifiers, etc.

For the initial version of KVM TDX support, what if we implemented the
Secure EPT management entirely off to the side? i.e. Not on top of the
TDP MMU. For example, write TDX-specific routines for:

 - Fully populating the Secure EPT tree some time during VM creation.
 - Tearing down the Secure EPT tree during VM destruction.
 - Support for unmapping/mapping specific regions of the Secure EPT
   tree for private<->shared conversions.

With that in place, KVM would never need to handle a fault on a Secure
EPT mapping. Any fault (e.g. due to an in-progress private<->shared
conversion) can just return back to the guest to retry the memory
access until the operation is complete.

If we start with only supporting 4K pages in the Secure EPT, the
Secure EPT routines described above would be almost trivial to
implement. Huge pages would add some complexity, but I don't think it
would be terrible. Concurrency can be handled with a single lock since
we don't have to worry about concurrent faulting.

This would avoid having TDX add a bunch of complexity to the TDP MMU
(which would only be used for shared mappings).

If and when we want to have more complicated memory management for TDX
private mappings, we could revisit TDP MMU integration. But I think
this design could even get us to the point of supporting dirty logging
(where the only fault KVM would have to handle for TDX private
mappings would be write-protection faults). I'm not sure it would work
for demand paging (at least the performance would not be great behind
a single lock), but we can cross that bridge when we get there.
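
To make the shape of the idea concrete, here is a rough userspace model
of what I have in mind. It is only a sketch: every name in it
(sept_model, sept_populate_all(), sept_set_range(), and so on) is made
up for illustration and is not existing KVM or TDX module code, the
Secure EPT "tree" is flattened to an array of 4K page states, and a
pthread mutex stands in for whatever single lock KVM would actually
use. The real routines would issue SEAMCALLs where this model just
flips a state.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model only: none of these names exist in KVM or TDX. */
#define SEPT_NR_PAGES 16	/* toy guest with 16 private 4K pages */

enum sept_state { SEPT_NONE, SEPT_MAPPED, SEPT_CONVERTING };

struct sept_model {
	pthread_mutex_t lock;	/* the single lock guarding all updates */
	enum sept_state pages[SEPT_NR_PAGES];
};

static void sept_init(struct sept_model *s)
{
	pthread_mutex_init(&s->lock, NULL);
	for (size_t i = 0; i < SEPT_NR_PAGES; i++)
		s->pages[i] = SEPT_NONE;
}

/* Fill every page with one state; caller must hold s->lock. */
static void sept_fill_locked(struct sept_model *s, enum sept_state st)
{
	for (size_t i = 0; i < SEPT_NR_PAGES; i++)
		s->pages[i] = st;
}

/* VM creation: map every private page up front, so no demand faults. */
static void sept_populate_all(struct sept_model *s)
{
	assert(s != NULL);
	pthread_mutex_lock(&s->lock);
	sept_fill_locked(s, SEPT_MAPPED);
	pthread_mutex_unlock(&s->lock);
}

/* VM destruction: drop every mapping. */
static void sept_teardown(struct sept_model *s)
{
	pthread_mutex_lock(&s->lock);
	sept_fill_locked(s, SEPT_NONE);
	pthread_mutex_unlock(&s->lock);
}

/*
 * Private<->shared conversion: move pages [start, start + nr) to a new
 * state. Returns false, changing nothing, if the range is out of bounds.
 */
static bool sept_set_range(struct sept_model *s, size_t start, size_t nr,
			   enum sept_state st)
{
	if (start > SEPT_NR_PAGES || nr > SEPT_NR_PAGES - start)
		return false;

	pthread_mutex_lock(&s->lock);
	for (size_t i = start; i < start + nr; i++)
		s->pages[i] = st;
	pthread_mutex_unlock(&s->lock);
	return true;
}

/*
 * Fault path: never installs or repairs a mapping. The caller always
 * resumes the guest so it retries the access; the return value only
 * reports whether that retry would succeed right now.
 */
static bool sept_handle_fault(struct sept_model *s, size_t idx)
{
	bool mapped;

	if (idx >= SEPT_NR_PAGES)
		return false;

	pthread_mutex_lock(&s->lock);
	mapped = s->pages[idx] == SEPT_MAPPED;
	pthread_mutex_unlock(&s->lock);
	return mapped;
}
```

In this model a conversion is sept_set_range(..., SEPT_CONVERTING)
followed later by sept_set_range(..., SEPT_MAPPED), and a vCPU that
faults in between just keeps re-entering the guest until the second
call lands. Nothing in the fault path ever writes to the table, which
is why one lock is enough here.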