Re: [PATCH v2] KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



2016-11-29 09:49+0100, Ladi Prosek:
> KVM does not correctly handle L1 hypervisors that emulate L2 real mode with
> PAE and EPT, such as Hyper-V. In this mode, the L1 hypervisor populates guest
> PDPTE VMCS fields and leaves guest CR3 uninitialized because it is not used
> (see 26.3.2.4 Loading Page-Directory-Pointer-Table Entries). KVM always
> dereferences CR3 and tries to load PDPTEs if PAE is on. This leads to two
> related issues:
> 
> 1) On the first nested vmentry, the guest PDPTEs, as populated by L1, are
> overwritten in ept_load_pdptrs because the registers are believed to have
> been loaded in load_pdptrs as part of kvm_set_cr3. This is incorrect. L2 is
> running with PAE enabled but PDPTRs have been set up by L1.
> 
> 2) When L2 is about to enable paging and loads its CR3, we, again, attempt
> to load PDPTEs in load_pdptrs called from kvm_set_cr3. There are no guarantees
> that this will succeed (it's just a CR3 load, paging is not enabled yet) and
> if it doesn't, kvm_set_cr3 returns early without persisting the CR3 which is
> then lost and L2 crashes right after it enables paging.
> 
> This patch replaces the kvm_set_cr3 call on nested entry and exit with a new
> function nested_vmx_load_cr3 which correctly handles PAE+EPT, performs CR3
> validity checks per the specification, and does not do unnecessary and
> duplicated MMU work.
> 
> Signed-off-by: Ladi Prosek <lprosek@xxxxxxxxxx>
> ---
> 
> tl;dr This makes Hyper-V (Windows Server 2016) work on KVM.
> 
> v1->v2:
> 
> * introduced nested_vmx_load_cr3 (more error handling, less MMU work)
> * also handles nested exit in addition to nested entry
> * amended commit description

There is a minor problem with abort in this patch ... and when you'll be
sending v3, please split it into more patches -- maybe it would be
nicest if you reused the patch from v1, just with is_long_mode() fixup
and then added patch(es) for error propagation from prepare_vmcs02 and
other cr3 checks.

(If you have other pressing issues, just the first fixed patch would be
 ok for the time being -- I can change it directly if you wish.)

> diff --git a/arch/x86/include/uapi/asm/vmx.h b/arch/x86/include/uapi/asm/vmx.h
> @@ -134,5 +134,6 @@
>  
>  #define VMX_ABORT_SAVE_GUEST_MSR_FAIL        1
>  #define VMX_ABORT_LOAD_HOST_MSR_FAIL         4
> +#define VMX_ABORT_LOAD_HOST_CR3_FAIL         5

Number 5 is suspicious ... SDM (27.7 VMX ABORTS) says it means:

  5. There was a machine-check event during VM exit (see Section 27.8).

You'll want number 2:

  2. Host checking of the page-directory-pointer-table entries (PDPTEs)
     failed (see Section 27.5.4).

> @@ -10670,8 +10723,8 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
>  
>  	nested_ept_uninit_mmu_context(vcpu);
>  
> -	kvm_set_cr3(vcpu, vmcs12->host_cr3);
> -	kvm_mmu_reset_context(vcpu);
> +	if (nested_vmx_load_cr3(vcpu, vmcs12->host_cr3, false, &entry_failure_code))
> +		nested_vmx_abort(vcpu, VMX_ABORT_LOAD_HOST_CR3_FAIL);

Only loading PDPTEs can fail, because host_cr3 cannot change and its
reserved bits are checked during VM entry, 26.2.2 (Checks on Host
Control Registers and MSRs).

Checking the host state during VM entry is slightly different and
described in 26.2 (CHECKS ON VMX CONTROLS AND HOST-STATE AREA).
KVM uses nested_vmx_failValid() to report an error as it happens before
trying to load the guest state, which means there is no VM exit.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux