From: Lai Jiangshan <laijs@xxxxxxxxxxxxxxxxx>

For VM-Enter, vmcs.GUEST_CR3 and vcpu->arch.cr3 are synced, so it is
better to mark VCPU_EXREG_CR3 available rather than dirty, which avoids
a redundant vmwrite(GUEST_CR3) in vmx_load_mmu_pgd().

But nested_vmx_load_cr3() also serves nested VM-Exit, which doesn't
write vmcs.GUEST_CR3.  This patch moves the write to vmcs.GUEST_CR3
into nested_vmx_load_cr3() so that it covers both nested VM-Enter and
VM-Exit, and uses kvm_register_mark_available().

This patch doesn't cause any extra write to vmcs.GUEST_CR3.  If
userspace later modifies CR3 via KVM_SET_SREGS, the dirty flag for
VCPU_EXREG_CR3 is set, so the next write to vmcs.GUEST_CR3 still
happens and no update is lost.

Signed-off-by: Lai Jiangshan <laijs@xxxxxxxxxxxxxxxxx>
---
 arch/x86/kvm/vmx/nested.c | 32 +++++++++++++++++++++-----------
 1 file changed, 21 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index ee5a68c2ea3a..4ddd4b1b0503 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1133,8 +1133,28 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3,
 	if (!nested_ept)
 		kvm_mmu_new_pgd(vcpu, cr3);
 
+	/*
+	 * Immediately write vmcs.GUEST_CR3 when changing vcpu->arch.cr3.
+	 *
+	 * VCPU_EXREG_CR3 is marked available rather than dirty because
+	 * vcpu->arch.cr3 and vmcs.GUEST_CR3 are synced when enable_ept, and
+	 * vmcs.GUEST_CR3 is irrelevant to vcpu->arch.cr3 when !enable_ept.
+	 *
+	 * For the VM-Enter case, it will be propagated to vmcs12 on nested
+	 * VM-Exit, which can occur without actually running L2 and thus
+	 * without hitting vmx_load_mmu_pgd(), e.g. if L1 is entering L2 with
+	 * vmcs12.GUEST_ACTIVITY_STATE=HLT, in which case KVM will intercept
+	 * the transition to HLT instead of running L2.
+	 *
+	 * For the VM-Exit case, it is likely that vmcs.GUEST_CR3 == cr3 here,
+	 * but L1 may set HOST_CR3 to a value other than its CR3 before
+	 * VM-Entry, so update it unconditionally.
+	 */
+	if (enable_ept)
+		vmcs_writel(GUEST_CR3, cr3);
+
 	vcpu->arch.cr3 = cr3;
-	kvm_register_mark_dirty(vcpu, VCPU_EXREG_CR3);
+	kvm_register_mark_available(vcpu, VCPU_EXREG_CR3);
 
 	/* Re-initialize the MMU, e.g. to pick up CR4 MMU role changes. */
 	kvm_init_mmu(vcpu);
@@ -2600,16 +2620,6 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
 				from_vmentry, entry_failure_code))
 		return -EINVAL;
 
-	/*
-	 * Immediately write vmcs02.GUEST_CR3.  It will be propagated to vmcs12
-	 * on nested VM-Exit, which can occur without actually running L2 and
-	 * thus without hitting vmx_load_mmu_pgd(), e.g. if L1 is entering L2 with
-	 * vmcs12.GUEST_ACTIVITY_STATE=HLT, in which case KVM will intercept the
-	 * transition to HLT instead of running L2.
-	 */
-	if (enable_ept)
-		vmcs_writel(GUEST_CR3, vmcs12->guest_cr3);
-
 	/* Late preparation of GUEST_PDPTRs now that EFER and CRs are set. */
 	if (load_guest_pdptrs_vmcs12 && nested_cpu_has_ept(vmcs12) &&
 	    is_pae_paging(vcpu)) {
-- 
2.19.1.6.gb485710b
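
For readers less familiar with KVM's register caching, a minimal, standalone
sketch of the available-vs-dirty distinction the changelog relies on follows.
It is plain user-space C, not kernel code: mark_available()/mark_dirty() and
load_mmu_pgd() only mirror the roles of kvm_register_mark_available(),
kvm_register_mark_dirty() and vmx_load_mmu_pgd(); the struct layout and flow
are simplified assumptions for illustration only.

#include <stdbool.h>
#include <stdio.h>

enum { VCPU_EXREG_CR3 };

struct vcpu {
	unsigned long cr3;		/* cached value (vcpu->arch.cr3)  */
	unsigned long vmcs_guest_cr3;	/* stand-in for vmcs.GUEST_CR3    */
	unsigned long regs_avail;	/* bitmap: cached value is valid  */
	unsigned long regs_dirty;	/* bitmap: cache needs write-back */
};

static void mark_available(struct vcpu *v, int reg)
{
	v->regs_avail |= 1ul << reg;
	v->regs_dirty &= ~(1ul << reg);
}

static void mark_dirty(struct vcpu *v, int reg)
{
	v->regs_avail |= 1ul << reg;
	v->regs_dirty |= 1ul << reg;
}

/* Analogue of vmx_load_mmu_pgd(): only write GUEST_CR3 when it is stale. */
static void load_mmu_pgd(struct vcpu *v)
{
	if (v->regs_dirty & (1ul << VCPU_EXREG_CR3)) {
		v->vmcs_guest_cr3 = v->cr3;	/* the "vmwrite" */
		mark_available(v, VCPU_EXREG_CR3);
		printf("GUEST_CR3 written: %#lx\n", v->cr3);
	} else {
		printf("GUEST_CR3 write skipped\n");
	}
}

/* Analogue of nested_vmx_load_cr3() after this patch. */
static void nested_load_cr3(struct vcpu *v, unsigned long cr3)
{
	v->vmcs_guest_cr3 = cr3;	/* written immediately ...        */
	v->cr3 = cr3;
	mark_available(v, VCPU_EXREG_CR3); /* ... so only marked available */
}

int main(void)
{
	struct vcpu v = { 0 };

	nested_load_cr3(&v, 0x1000);
	load_mmu_pgd(&v);		/* skipped: cache already in sync */

	v.cr3 = 0x2000;			/* KVM_SET_SREGS-style update     */
	mark_dirty(&v, VCPU_EXREG_CR3);
	load_mmu_pgd(&v);		/* written: the update isn't lost */
	return 0;
}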