On Thu, Jun 06, 2019 at 02:22:56PM +0200, Paolo Bonzini wrote:
> On 20/05/19 22:10, Sean Christopherson wrote:
> > @@ -3777,18 +3777,8 @@ static void nested_vmx_restore_host_state(struct kvm_vcpu *vcpu)
> >  	vmx_set_cr4(vcpu, vmcs_readl(CR4_READ_SHADOW));
> >  
> >  	nested_ept_uninit_mmu_context(vcpu);
> > -
> > -	/*
> > -	 * This is only valid if EPT is in use, otherwise the vmcs01 GUEST_CR3
> > -	 * points to shadow pages!  Fortunately we only get here after a WARN_ON
> > -	 * if EPT is disabled, so a VMabort is perfectly fine.
> > -	 */
> > -	if (enable_ept) {
> > -		vcpu->arch.cr3 = vmcs_readl(GUEST_CR3);
> > -		__set_bit(VCPU_EXREG_CR3, (ulong *)&vcpu->arch.regs_avail);
> > -	} else {
> > -		nested_vmx_abort(vcpu, VMX_ABORT_VMCS_CORRUPTED);
> > -	}
> > +	vcpu->arch.cr3 = vmcs_readl(GUEST_CR3);
> > +	__set_bit(VCPU_EXREG_CR3, (ulong *)&vcpu->arch.regs_avail);
> >  
> >  	/*
> >  	 * Use ept_save_pdptrs(vcpu) to load the MMU's cached PDPTRs
>
> This hunk needs to be moved to patch 1, which then becomes much easier
> to understand...

I kept the revert in a separate patch so that the bug fix could be easily
backported to stable branches (commit 2b27924bb1d4 ("KVM: nVMX: always use
early vmcs check when EPT is disabled") wasn't tagged for stable).

> I'm still missing however the place where kvm_mmu_new_cr3 is called
> in the nested_vmx_restore_host_state path.

vcpu->arch.root_mmu.root_hpa is set to INVALID_PAGE via:

  nested_vmx_restore_host_state() ->
      kvm_mmu_reset_context() ->
          kvm_mmu_unload() ->
              kvm_mmu_free_roots()

kvm_mmu_unload() has WARN_ON(root_hpa != INVALID_PAGE), i.e. we can bank
on 'root_hpa == INVALID_PAGE' unless the implementation of
kvm_mmu_reset_context() is changed.

On the way back into L1, VMCS.GUEST_CR3 is guaranteed to be written (on a
successful entry) via:

  vcpu_enter_guest() ->
      kvm_mmu_reload() ->
          kvm_mmu_load() ->
              kvm_mmu_load_cr3() ->
                  vmx_set_cr3()

The early-out optimization in kvm_mmu_reload() won't trigger because
kvm_mmu_unload() set vcpu->arch.root_mmu.root_hpa to INVALID_PAGE, and
vcpu->arch.mmu is guaranteed to point at root_mmu (via
nested_ept_uninit_mmu_context()), so kvm_mmu_load() is always called.
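
For reference, the check in kvm_mmu_reload() that gets skipped is roughly
the following (paraphrased, not the exact upstream code; field layout and
helpers may differ between kernel versions):

	static inline int kvm_mmu_reload(struct kvm_vcpu *vcpu)
	{
		/* Roots still valid, no need to rebuild them or rewrite CR3. */
		if (likely(vcpu->arch.mmu->root_hpa != INVALID_PAGE))
			return 0;

		/*
		 * Rebuild the roots and load CR3, i.e. reach vmx_set_cr3()
		 * via kvm_mmu_load_cr3().
		 */
		return kvm_mmu_load(vcpu);
	}

With root_mmu.root_hpa == INVALID_PAGE, the early return can't be taken,
so GUEST_CR3 is always rewritten on the next successful entry.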