On Sat, Feb 19, 2022, Paolo Bonzini wrote: > On 2/18/22 22:45, Sean Christopherson wrote: > > On Thu, Feb 17, 2022, Paolo Bonzini wrote: > > > Whenever KVM knows the page role flags have changed, it needs to drop > > > the current MMU root and possibly load one from the prev_roots cache. > > > Currently it is papering over some overly simplistic code by just > > > dropping _all_ roots, so that the root will be reloaded by > > > kvm_mmu_reload, but this has bad performance for the TDP MMU > > > (which drops the whole of the page tables when freeing a root, > > > without the performance safety net of a hash table). > > > > > > To do this, KVM needs to do a more kvm_mmu_update_root call from > > > kvm_mmu_reset_context. Introduce a new request bit so that the call > > > can be delayed until after a possible KVM_REQ_MMU_RELOAD, which would > > > kill all hopes of finding a cached PGD. > > > > > > Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx> > > > --- > > > > Please no. > > > > I really, really do not want to add yet another deferred-load in the nested > > virtualization paths. > > This is not a deferred load, is it? It's only kvm_mmu_new_pgd that is > deferred, but the PDPTR load is not. Yeah, I'm referring to kvm_mmu_new_pgd(). > > I strongly prefer that we take a more conservative approach and fix 7+8, and then > > tackle 1, 3, and 4+5 separately if someone cares enough about those flows to avoid > > dropping roots. > > The thing is, I want to get rid of kvm_mmu_reset_context() altogether. I > dislike the fact that it kills the roots but still keeps them in the hash > table, thus relying on separate syncing to avoid future bugs. It's very > unintuitive what is "reset" and what isn't. I agree with all of the above, I just don't think that forcing the issue is going to be a net positive in the long run. > > Regarding KVM_REQ_MMU_RELOAD, that mess mostly goes away with my series to replace > > that with KVM_REQ_MMU_FREE_OBSOLETE_ROOTS. Obsolete TDP MMU roots will never get > > a cache hit because the obsolete root will have an "invalid" role. And if we care > > about optimizing this with respect to a memslot (highly unlikely), then we could > > add an MMU generation check in the cache lookup. I was planning on posting that > > series as soon as this one is queued, but I'm more than happy to speculatively send > > a refreshed version that applies on top of this series. > > Yes, please send a version on top of patches 1-13. That can be reviewed and > committed in parallel with the root_role changes. Will do.