On Wed, Mar 18, 2020 at 11:36:04AM +0100, Paolo Bonzini wrote:
> On 17/03/20 19:22, Sean Christopherson wrote:
> > On Tue, Mar 17, 2020 at 06:18:37PM +0100, Paolo Bonzini wrote:
> >> On 17/03/20 05:52, Sean Christopherson wrote:
> >>> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> >>> index d816f1366943..a77eab5b0e8a 100644
> >>> --- a/arch/x86/kvm/vmx/nested.c
> >>> +++ b/arch/x86/kvm/vmx/nested.c
> >>> @@ -1123,7 +1123,7 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool ne
> >>>  	}
> >>>
> >>>  	if (!nested_ept)
> >>> -		kvm_mmu_new_cr3(vcpu, cr3, false);
> >>> +		kvm_mmu_new_cr3(vcpu, cr3, enable_ept);
> >>
> >> Even if enable_ept == false, we could have already scheduled or flushed
> >> the TLB soon due to one of 1) nested_vmx_transition_tlb_flush 2)
> >> vpid_sync_context in prepare_vmcs02 3) the processor doing it for
> >> !enable_vpid.
> >>
> >> So for !enable_ept only KVM_REQ_MMU_SYNC is needed, not
> >> KVM_REQ_TLB_FLUSH_CURRENT I think. Worth adding a TODO?
> >
> > Now that you point it out, I think it makes sense to unconditionally pass
> > %true here, i.e. rely 100% on nested_vmx_transition_tlb_flush() to do the
> > right thing.
>
> Why doesn't it need KVM_REQ_MMU_SYNC either?

Hmm, so if L1 is using VPID, we're ok without a sync. Junaid's INVVPID patch
earlier in this series ensures cached roots won't retain unsync'd SPTEs when
L1 does INVVPID. If L1 doesn't flush L2's TLB on VM-Entry, it can't expect
L2 to recognize changes in the PTEs since the last INVVPID.

Per Intel's SDM, INVLPG (and INVPCID) are only required to invalidate entries
for the current VPID, i.e. virtual VPID=0 when executed by L1.

  Operations that architecturally invalidate entries in the TLBs or
  paging-structure caches independent of VMX operation (e.g., the INVLPG and
  INVPCID instructions) invalidate linear mappings and combined mappings.
  They are required to do so only for the current VPID.
                             ^^^^^^^^^^^^^^^^^^^^^^^^^

If L1 isn't using VPID and L0 isn't using EPT, then a sync is required as L1
would expect PTE changes to be recognized without an explicit INVLPG prior to
VM-Enter.

So something like this?

	if (!nested_ept)
		kvm_mmu_new_cr3(vcpu, cr3, enable_ept ||
					   nested_cpu_has_vpid(vmcs12));

The KVM_REQ_TLB_FLUSH_CURRENT request would be redundant with
nested_vmx_transition_tlb_flush() when VPID is enabled, and is a (big) nop
when VPID is disabled. In either case the overhead is negligible. Ideally
this logic would tie into nested_vmx_transition_tlb_flush() in some way, but
making that happen may be wishful thinking.

> All this should be in a comment as well, of course.

Heh, in hindsight that's painfully obvious.
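
For reference, the condition proposed above can be written out as a small
userspace truth table. This is only a sketch, not kernel code: the function
and parameter names are hypothetical, l0_ept stands in for enable_ept, l1_vpid
stands in for nested_cpu_has_vpid(vmcs12), and it assumes (as the discussion
implies) that a true third argument to kvm_mmu_new_cr3() means the flush and
sync may be skipped.

```c
#include <stdbool.h>

/*
 * Userspace model only; none of these names exist in KVM.
 *
 *   l0_ept:  stand-in for enable_ept (L0 uses EPT)
 *   l1_vpid: stand-in for nested_cpu_has_vpid(vmcs12) (L1 uses VPID for L2)
 *
 * Returns true when the flush/sync on the nested CR3 load could be skipped,
 * i.e. the value the proposal would pass as the third argument.
 */
static inline bool can_skip_flush_and_sync(bool l0_ept, bool l1_vpid)
{
	/* Case already covered by the original patch: L0 uses EPT. */
	if (l0_ept)
		return true;

	/*
	 * L1 uses VPID: L1 has to flush L2's TLB itself (e.g. INVVPID) before
	 * it can expect L2 to recognize PTE changes, so no sync is needed here.
	 */
	if (l1_vpid)
		return true;

	/*
	 * No VPID in L1 and no EPT in L0: L1 expects PTE changes to be
	 * recognized without an explicit INVLPG, so a sync is required.
	 */
	return false;
}
```

Only the (l0_ept == false, l1_vpid == false) combination returns false, which
matches the one case called out above as needing a sync.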