This series has been pending on the mailing list for a long time, and it is a big feature for nested virtualization. I also suspect that the original authors (Jun and Nadav) do not have enough time to continue it, so I will pick it up. :)

See comments below:

Paolo Bonzini wrote on 2013-05-20:
> On 19/05/2013 06:52, Jun Nakajima wrote:
> > From: Nadav Har'El <nyh@xxxxxxxxxx>
> >
> > Recent KVM, since
> > http://kerneltrap.org/mailarchive/linux-kvm/2010/5/2/6261577
> > switch the EFER MSR when EPT is used and the host and guest have different
> > NX bits. So if we add support for nested EPT (L1 guest using EPT to run L2)
> > and want to be able to run recent KVM as L1, we need to allow L1 to use this
> > EFER switching feature.
> >
> > To do this EFER switching, KVM uses VM_ENTRY/EXIT_LOAD_IA32_EFER if available,
> > and if it isn't, it uses the generic VM_ENTRY/EXIT_MSR_LOAD. This patch adds
> > support for the former (the latter is still unsupported).
> >
> > Nested entry and exit emulation (prepare_vmcs_02 and load_vmcs12_host_state,
> > respectively) already handled VM_ENTRY/EXIT_LOAD_IA32_EFER correctly. So all
> > that's left to do in this patch is to properly advertise this feature to L1.
> >
> > Note that vmcs12's VM_ENTRY/EXIT_LOAD_IA32_EFER are emulated by L0, by using
> > vmx_set_efer (which itself sets one of several vmcs02 fields), so we always
> > support this feature, regardless of whether the host supports it.
> >
> > Signed-off-by: Nadav Har'El <nyh@xxxxxxxxxx>
> > Signed-off-by: Jun Nakajima <jun.nakajima@xxxxxxxxx>
> > Signed-off-by: Xinhao Xu <xinhao.xu@xxxxxxxxx>
> > ---
> >  arch/x86/kvm/vmx.c | 23 ++++++++++++++++-------
> >  1 file changed, 16 insertions(+), 7 deletions(-)
> >
> > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> > index 260a919..fb9cae5 100644
> > --- a/arch/x86/kvm/vmx.c
> > +++ b/arch/x86/kvm/vmx.c
> > @@ -2192,7 +2192,8 @@ static __init void nested_vmx_setup_ctls_msrs(void)
> >  #else
> >  	nested_vmx_exit_ctls_high = 0;
> >  #endif
> > -	nested_vmx_exit_ctls_high |= VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR;
> > +	nested_vmx_exit_ctls_high |= (VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR |
> > +		VM_EXIT_LOAD_IA32_EFER);
> >
> >  	/* entry controls */
> >  	rdmsr(MSR_IA32_VMX_ENTRY_CTLS,
> > @@ -2201,8 +2202,8 @@ static __init void nested_vmx_setup_ctls_msrs(void)
> >  	nested_vmx_entry_ctls_low = VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR;
> >  	nested_vmx_entry_ctls_high &=
> >  		VM_ENTRY_LOAD_IA32_PAT | VM_ENTRY_IA32E_MODE;
> > -	nested_vmx_entry_ctls_high |= VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR;
> > -
> > +	nested_vmx_entry_ctls_high |= (VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR |
> > +		VM_ENTRY_LOAD_IA32_EFER);
> >  	/* cpu-based controls */
> >  	rdmsr(MSR_IA32_VMX_PROCBASED_CTLS,
> >  		nested_vmx_procbased_ctls_low, nested_vmx_procbased_ctls_high);
> > @@ -7492,10 +7493,18 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
> >  	vcpu->arch.cr0_guest_owned_bits &= ~vmcs12->cr0_guest_host_mask;
> >  	vmcs_writel(CR0_GUEST_HOST_MASK, ~vcpu->arch.cr0_guest_owned_bits);
> >
> > -	/* Note: IA32_MODE, LOAD_IA32_EFER are modified by vmx_set_efer below */
> > -	vmcs_write32(VM_EXIT_CONTROLS,
> > -		vmcs12->vm_exit_controls | vmcs_config.vmexit_ctrl);
> > -	vmcs_write32(VM_ENTRY_CONTROLS, vmcs12->vm_entry_controls |
> > +	/* L2->L1 exit controls are emulated - the hardware exit is to L0 so
> > +	 * we should use its exit controls. Note that IA32_MODE, LOAD_IA32_EFER
> > +	 * bits are further modified by vmx_set_efer() below.
> > +	 */
> > +	vmcs_write32(VM_EXIT_CONTROLS, vmcs_config.vmexit_ctrl);

This is wrong. We cannot use L0's exit controls directly. LOAD_PERF_GLOBAL_CTRL, LOAD_HOST_EFER, LOAD_HOST_PAT, and ACK_INTR_ON_EXIT should come from the host's exit controls, but the other bits still need to use (vmcs12 | host).

> > +
> > +	/* vmcs12's VM_ENTRY_LOAD_IA32_EFER and VM_ENTRY_IA32E_MODE are
> > +	 * emulated by vmx_set_efer(), below.
>
> VM_ENTRY_LOAD_IA32_EFER is not emulated by vmx_set_efer, so:

VM_ENTRY_LOAD_IA32_EFER is handled in setup_msrs(), and vmx_set_efer() already calls it.

>
> 	/* vmcs12's VM_ENTRY_LOAD_IA32_EFER and VM_ENTRY_IA32E_MODE
> 	 * are emulated below. VM_ENTRY_IA32E_MODE is handled in
> 	 * vmx_set_efer(). */
>
> Paolo
>
> > +	 */
> > +	vmcs_write32(VM_ENTRY_CONTROLS,
> > +		(vmcs12->vm_entry_controls & ~VM_ENTRY_LOAD_IA32_EFER &
> > +			~VM_ENTRY_IA32E_MODE) |
> >  		(vmcs_config.vmentry_ctrl & ~VM_ENTRY_IA32E_MODE));
> >
> >  	if (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PAT)
> >
> > --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html

Best regards,
Yang
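
For illustration only, the exit-control merge described in the comment on VM_EXIT_CONTROLS could look roughly like the sketch below. It is not code from the patch or from vmx.c: merge_exit_controls() and L0_OWNED_EXIT_CTLS are hypothetical names, and the bit values are written out by hand from the VM-exit control layout so the snippet builds standalone. The bits listed in the comment are taken from the host's value only, and every other bit is (vmcs12 | host).

```c
#include <stdint.h>

/* VM-exit control bits, spelled out here so the sketch is self-contained. */
#define VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL 0x00001000u
#define VM_EXIT_ACK_INTR_ON_EXIT           0x00008000u
#define VM_EXIT_SAVE_IA32_PAT              0x00040000u
#define VM_EXIT_LOAD_IA32_PAT              0x00080000u
#define VM_EXIT_LOAD_IA32_EFER             0x00200000u

/*
 * Hypothetical mask (an assumption of this sketch): bits that describe
 * how the hardware exit lands in L0, so L1's vmcs12 must not set them.
 */
#define L0_OWNED_EXIT_CTLS (VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL | \
			    VM_EXIT_LOAD_IA32_EFER |             \
			    VM_EXIT_LOAD_IA32_PAT |              \
			    VM_EXIT_ACK_INTR_ON_EXIT)

/*
 * Hypothetical helper: L0-owned bits come from the host value only,
 * all remaining bits are the OR of the vmcs12 and host values.
 */
static uint32_t merge_exit_controls(uint32_t vmcs12_ctls, uint32_t host_ctls)
{
	return (vmcs12_ctls & ~L0_OWNED_EXIT_CTLS) | host_ctls;
}
```

With something like this, prepare_vmcs02() would write the merged value to VM_EXIT_CONTROLS instead of vmcs_config.vmexit_ctrl alone. Whether the mask above is complete is an open question for the real patch.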