Re: [PATCH 1/2] KVM: nVMX: Stash L1's CR3 in vmcs01.GUEST_CR3 on nested entry w/o EPT

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 20/05/19 22:10, Sean Christopherson wrote:
> KVM does not have 100% coverage of VMX consistency checks, i.e. some
> checks that cause VM-Fail may only be detected by hardware during a
> nested VM-Entry.  In such a case, KVM must restore L1's state to the
> pre-VM-Enter state as L2's state has already been loaded into KVM's
> software model.
> 
> L1's CR3 and PDPTRs in particular are loaded from vmcs01.GUEST_*.  But
> when EPT is disabled, the associated fields hold KVM's shadow values,
> not L1's "real" values.  Fortunately, when EPT is disabled the PDPTRs
> come from memory, i.e. are not cached in the VMCS.  Which leaves CR3
> as the sole anomaly.  Handle CR3 by overwriting vmcs01.GUEST_CR3 with
> L1's CR3 during the nested VM-Entry when EPT is disabled *and* nested
> early checks are disabled, so that nested_vmx_restore_host_state() will
> naturally restore the correct vcpu->arch.cr3 from vmcs01.GUEST_CR3.
> 
> Note, these shenanigans work because nested_vmx_restore_host_state()
> does a full kvm_mmu_reset_context(), i.e. unloads the current MMU,
> which guarantees vmcs01.GUEST_CR3 will be rewritten with a new shadow
> CR3 prior to re-entering L1.  Writing vmcs01.GUEST_CR3 is done if and
> only if nested early checks are disabled as "late" VM-Fail should never
> happen in that case (KVM WARNs), and the conditional write avoids the
> need to restore the correct GUEST_CR3 when nested_vmx_check_vmentry_hw()
> fails.
> 
> Reported-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> Fixes: bd18bffca353 ("KVM: nVMX: restore host state in nested_vmx_vmexit for VMFail")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> ---
>  arch/x86/kvm/vmx/nested.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 1032f068f0b9..92117092f6e9 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -2963,6 +2963,8 @@ int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu, bool from_vmentry)
>  	if (kvm_mpx_supported() &&
>  		!(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS))
>  		vmx->nested.vmcs01_guest_bndcfgs = vmcs_read64(GUEST_BNDCFGS);

This hunk needs a comment that says basically the same things that are
in the commit message.

	/* Temporarily overwrite vmcs01.GUEST_CR3 with L1's CR3 during
	 * the nested VM-Entry when EPT is disabled *and* nested early
	 * checks are disabled, because nested_vmx_restore_host_state()
	 * wants to restore the correct vcpu->arch.cr3 from
	 * vmcs01.GUEST_CR3.  On nested vmexit, kvm_mmu_reset_context()
	 * will overwrite GUEST_CR3 with the shadow CR3 prior to
	 * re-entering L1.  This is not needed when nested early checks
	 * are enabled, because it doesn't reload the MMU until
	 * after the checks have succeeded.
	 */

And also, the restore of the correct vcpu->arch.cr3 from
vmcs01.GUEST_CR3 can be moved to this patch otherwise the comment would
not be accurate.

But, as I said in the review of patch 2, I'm a bit lost as to how
kvm_mmu_reset_context() is enough to reload the shadow CR3 into the vmcs01.

Paolo

> +	if (!enable_ept && !nested_early_check)
> +		vmcs_writel(GUEST_CR3, vcpu->arch.cr3);
>  
>  	vmx_switch_vmcs(vcpu, &vmx->nested.vmcs02);
>  
> @@ -3794,7 +3796,8 @@ static void nested_vmx_restore_host_state(struct kvm_vcpu *vcpu)
>  	 * VMFail, like everything else we just need to ensure our
>  	 * software model is up-to-date.
>  	 */
> -	ept_save_pdptrs(vcpu);
> +	if (enable_ept)
> +		ept_save_pdptrs(vcpu);
>  
>  	kvm_mmu_reset_context(vcpu);
>  
> 




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux