On Tue, 2024-03-19 at 16:56 -0700, Isaku Yamahata wrote:
> When we zap a page from the guest, and add it again on TDX even with
> the same GPA, the page is zeroed. We'd like to keep memory contents
> for those cases.
>
> Ok, let me add those whys and drop migration part. Here is the
> updated one.
>
> TDX supports only write-back(WB) memory type for private memory
> architecturally so that (virtualized) memory type change doesn't make
> sense for private memory. When we remove the private page from the
> guest and re-add it with the same GPA, the page is zeroed.
>
> Regarding memory type change (mtrr virtualization and lapic page
> mapping change), the current implementation zaps pages, and populate
s^
> the page with new memory type on the next KVM page fault.
^s

> It doesn't work for TDX to have zeroed pages.

What does this mean? Above you mention how all the pages are zeroed. Do
you mean it doesn't work for TDX to zero a running guest's pages? That
would happen for the operations that expect the pages can be faulted
in again just fine.

> Because TDX supports only WB, we
> ignore the request for MTRR and lapic page change to not zap private
> pages on unmapping for those two cases

Hmm. I need to go back and look at this again. It's not clear from the
description why it is safe for the host to not zap pages if requested
to. I see why the guest wouldn't want them to be zapped.

>
> TDX Secure-EPT requires removing the guest pages first and leaf
> Secure-EPT pages in order. It doesn't allow zap a Secure-EPT entry
> that has child pages. It doesn't work with the current TDP MMU
> zapping logic that zaps the root page table without touching child
> pages. Instead, zap only leaf SPTEs for KVM mmu that has a shared
> bit mask.

Could this be better as two patches that each address a separate thing?
1. Leaf only zapping
2. Don't zap for MTRR, etc.

> >
> > There seems to be an attempt to abstract away the existence of
> > Secure-EPT in mmu.c, that is not fully successful. In this case the
> > code checks kvm_gfn_shared_mask() to see if it needs to handle the
> > zapping in a way specific needed by S-EPT. It ends up being a little
> > confusing because the actual check is about whether there is a
> > shared bit. It only works because only S-EPT is the only thing that
> > has a kvm_gfn_shared_mask().
> >
> > Doing something like (kvm->arch.vm_type == KVM_X86_TDX_VM) looks
> > wrong, but is more honest about what we are getting up to here. I'm
> > not sure though, what do you think?
>
> Right, I attempted and failed in zapping case. This is due to the
> restriction that the Secure-EPT pages must be removed from the
> leaves. the VMX case (also NPT, even SNP) heavily depends on zapping
> root entry as optimization.
>
> I can think of
> - add TDX check. Looks wrong
> - Use kvm_gfn_shared_mask(kvm). confusing
> - Give other name for this check like zap_from_leafs (or better
>   name?)
>   The implementation is same to kvm_gfn_shared_mask() with comment.
> - Or we can add a boolean variable to struct kvm

Hmm, maybe wrap it in a function like:

static inline bool kvm_can_only_zap_leafs(const struct kvm *kvm)
{
	/* A comment explaining what is going on */
	return kvm->arch.vm_type == KVM_X86_TDX_VM;
}

But KVM seems to be a bit more on the open coded side when it comes to
things like this, so I'm not sure what maintainers would prefer. My
opinion is that the kvm_gfn_shared_mask() check is too strange and it's
worth a new helper. If that is bad, then just open coding
kvm->arch.vm_type == KVM_X86_TDX_VM is the second best, I think.

I feel strongly that it should be changed, but am unsure which form
maintainers would prefer. Hopefully one will chime in.
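
To make the ordering constraint concrete, here is a rough user-space toy, not kernel code. It only models the rule discussed above: a Secure-EPT page cannot be removed while it still has children, so a TDX-style teardown has to go children first, whereas the existing path can drop the root and leave the children for later. Every toy_* name, the two-way fanout and the struct layouts are made up for illustration; only the shape of the predicate follows the kvm_can_only_zap_leafs() idea.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins, NOT the real KVM definitions. */
enum toy_vm_type { TOY_VM_DEFAULT, TOY_VM_TDX };

struct toy_kvm {
	enum toy_vm_type vm_type;
};

#define TOY_FANOUT 2

struct toy_pt_page {
	struct toy_pt_page *child[TOY_FANOUT];
	bool present;
};

/* Same shape as the helper suggested above, on the toy struct. */
static inline bool toy_can_only_zap_leafs(const struct toy_kvm *kvm)
{
	return kvm->vm_type == TOY_VM_TDX;
}

static bool toy_has_children(const struct toy_pt_page *pt)
{
	for (size_t i = 0; i < TOY_FANOUT; i++)
		if (pt->child[i] && pt->child[i]->present)
			return true;
	return false;
}

/*
 * Remove one page. With leaf_only_rule set, refuse (return false) while
 * any child is still present, modelling the S-EPT restriction.
 */
static bool toy_remove_page(struct toy_pt_page *pt, bool leaf_only_rule)
{
	if (leaf_only_rule && toy_has_children(pt))
		return false;
	pt->present = false;
	return true;
}

/* Post-order walk: children first, then the page itself. */
static int toy_zap_leaf_first(struct toy_pt_page *pt)
{
	int removed = 0;

	if (!pt || !pt->present)
		return 0;
	for (size_t i = 0; i < TOY_FANOUT; i++)
		removed += toy_zap_leaf_first(pt->child[i]);

	bool ok = toy_remove_page(pt, true);
	assert(ok);	/* cannot fail: all children are already gone */
	(void)ok;
	return removed + 1;
}

/* Returns the number of pages removed by this call. */
static int toy_zap(const struct toy_kvm *kvm, struct toy_pt_page *root)
{
	if (toy_can_only_zap_leafs(kvm))
		return toy_zap_leaf_first(root);

	/* Root-first shortcut: drop the root, children are reclaimed later. */
	return toy_remove_page(root, false) ? 1 : 0;
}
```

With a root that has two present leaves, toy_remove_page(&root, true) is refused, while toy_zap() on a TOY_VM_TDX instance removes all three pages bottom-up. On a TOY_VM_DEFAULT instance toy_zap() removes only the root. The point of the sketch is just that the predicate's name describes the zapping restriction rather than the presence of a shared bit.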