On Tue, Aug 06, 2013 at 10:18:17AM +0200, Jan Kiszka wrote:
> On 2013-08-06 10:00, Gleb Natapov wrote:
> > On Tue, Aug 06, 2013 at 09:55:09AM +0200, Jan Kiszka wrote:
> >> On 2013-08-06 09:51, Gleb Natapov wrote:
> >>> On Tue, Aug 06, 2013 at 09:47:23AM +0200, Jan Kiszka wrote:
> >>>> On 2013-08-05 13:40, Gleb Natapov wrote:
> >>>>> On Mon, Aug 05, 2013 at 07:27:33PM +0800, Arthur Chunqi Li wrote:
> >>>>>> On Mon, Aug 5, 2013 at 4:07 PM, Gleb Natapov <gleb@xxxxxxxxxx> wrote:
> >>>>>>> From: Nadav Har'El <nyh@xxxxxxxxxx>
> >>>>>>>
> >>>>>>> Recent KVM, since http://kerneltrap.org/mailarchive/linux-kvm/2010/5/2/6261577
> >>>>>>> switch the EFER MSR when EPT is used and the host and guest have different
> >>>>>>> NX bits. So if we add support for nested EPT (L1 guest using EPT to run L2)
> >>>>>>> and want to be able to run recent KVM as L1, we need to allow L1 to use this
> >>>>>>> EFER switching feature.
> >>>>>>>
> >>>>>>> To do this EFER switching, KVM uses VM_ENTRY/EXIT_LOAD_IA32_EFER if available,
> >>>>>>> and if it isn't, it uses the generic VM_ENTRY/EXIT_MSR_LOAD. This patch adds
> >>>>>>> support for the former (the latter is still unsupported).
> >>>>>>>
> >>>>>>> Nested entry and exit emulation (prepare_vmcs_02 and load_vmcs12_host_state,
> >>>>>>> respectively) already handled VM_ENTRY/EXIT_LOAD_IA32_EFER correctly. So all
> >>>>>>> that's left to do in this patch is to properly advertise this feature to L1.
> >>>>>>>
> >>>>>>> Note that vmcs12's VM_ENTRY/EXIT_LOAD_IA32_EFER are emulated by L0, by using
> >>>>>>> vmx_set_efer (which itself sets one of several vmcs02 fields), so we always
> >>>>>>> support this feature, regardless of whether the host supports it.
> >>>>>>>
> >>>>>>> Reviewed-by: Orit Wasserman <owasserm@xxxxxxxxxx>
> >>>>>>> Signed-off-by: Nadav Har'El <nyh@xxxxxxxxxx>
> >>>>>>> Signed-off-by: Jun Nakajima <jun.nakajima@xxxxxxxxx>
> >>>>>>> Signed-off-by: Xinhao Xu <xinhao.xu@xxxxxxxxx>
> >>>>>>> Signed-off-by: Yang Zhang <yang.z.zhang@xxxxxxxxx>
> >>>>>>> Signed-off-by: Gleb Natapov <gleb@xxxxxxxxxx>
> >>>>>>> ---
> >>>>>>>  arch/x86/kvm/vmx.c | 23 ++++++++++++++++-------
> >>>>>>>  1 file changed, 16 insertions(+), 7 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> >>>>>>> index e999dc7..27efa6a 100644
> >>>>>>> --- a/arch/x86/kvm/vmx.c
> >>>>>>> +++ b/arch/x86/kvm/vmx.c
> >>>>>>> @@ -2198,7 +2198,8 @@ static __init void nested_vmx_setup_ctls_msrs(void)
> >>>>>>>  #else
> >>>>>>>       nested_vmx_exit_ctls_high = 0;
> >>>>>>>  #endif
> >>>>>>> -     nested_vmx_exit_ctls_high |= VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR;
> >>>>>>> +     nested_vmx_exit_ctls_high |= (VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR |
> >>>>>>> +                                   VM_EXIT_LOAD_IA32_EFER);
> >>>>>> Gleb, why we don't need to check whether host supports
> >>>>>> VM_EXIT_LOAD_IA32_EFER here, as what you noted in my
> >>>>>> VM_EXIT_LOAD_IA32_PAT patch?
> >>>>> Nested VMX completely emulates the capability.
> >>>>
> >>>> No, it doesn't. The values for host/guest are handled over via the
> >>>> corresponding VMCS fields, physically, even though the actual loading is
> >>>> emulated then. So we must not expose this feature unconditionally.
> >>> Can you show me the code where it happens?
> >>
> >> When the guest writes to HOST/GUEST_IA32_EFER, we will store this in the
> >> vmcs that will then become the active one on next L1/L2 entry, no?
> >>
> > Guest writes are stored in vmcs12 which is not HW vmcs, just a format
> > kvm uses internally (see vmcs12_write_any).
> > During guest entry vmcs02 is constructed from vmcs12 (see prepare_vmcs02)
> > and this function does not access HOST/GUEST_IA32_EFER directly, it uses
> > vmx_set_efer instead which takes care of things. Same function access
> > GUEST_IA32_PAT directly though.
>
> OK, right.
>
> Is it also safe to write to any field of a shadow VMCS? That's currently
> hypothetical, but what if the host supports shadowing but not EFER
> loading? I don't see a technical reason but also no clear statement in
> the SDM regarding this scenario.
>
We do not shadow all VMCS fields, only some of them (see the
shadow_read_write_fields array). HOST/GUEST_IA32_EFER is not among the
fields we shadow.

--
			Gleb.
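
For readers following the vmcs12/vmcs02 argument in this thread, here is a
minimal standalone sketch in C of the idea. It is not kernel code. Every
name in it (toy_vmcs12, toy_vmcs02, toy_vmwrite_guest_efer, toy_set_efer,
toy_prepare_vmcs02) is hypothetical and only stands in for the roles that
vmcs12, vmcs02, vmcs12_write_any, vmx_set_efer and prepare_vmcs02 play in
the discussion. The fallback path (a software MSR switch list) is an
assumption made for illustration; the thread only says that vmx_set_efer
sets one of several vmcs02 fields.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only. All types and functions here are hypothetical. */

/* In-memory structure holding what L1 wrote; never loaded by hardware. */
struct toy_vmcs12 {
    uint64_t guest_ia32_efer;
    bool entry_load_efer;      /* L1 enabled "load IA32_EFER" on entry */
};

/* What L0 programs in order to run L2. */
struct toy_vmcs02 {
    bool hw_field_used;        /* hardware EFER field was written */
    uint64_t hw_guest_efer;
    bool msr_switch_used;      /* assumed fallback: MSR switch list */
    uint64_t msr_switch_efer;
};

/* L1's write to GUEST_IA32_EFER only updates the software structure. */
static void toy_vmwrite_guest_efer(struct toy_vmcs12 *v12, uint64_t efer)
{
    v12->guest_ia32_efer = efer;
}

/* Central helper: picks whichever mechanism the host actually offers. */
static void toy_set_efer(struct toy_vmcs02 *v02, uint64_t efer,
                         bool host_has_load_efer)
{
    if (host_has_load_efer) {
        v02->hw_field_used = true;
        v02->hw_guest_efer = efer;
    } else {
        v02->msr_switch_used = true;
        v02->msr_switch_efer = efer;
    }
}

/*
 * Builds the L2 state from L1's structure. It never copies the EFER
 * field into hardware state itself; it always goes through the helper.
 * (Simplified: if L1 did not ask for EFER loading, L1's current EFER
 * is used unchanged.)
 */
static void toy_prepare_vmcs02(const struct toy_vmcs12 *v12,
                               struct toy_vmcs02 *v02,
                               uint64_t l1_efer, bool host_has_load_efer)
{
    uint64_t efer = v12->entry_load_efer ? v12->guest_ia32_efer : l1_efer;

    toy_set_efer(v02, efer, host_has_load_efer);
}
```

In this model, calling toy_prepare_vmcs02 with host_has_load_efer set to
false still delivers L1's EFER value to L2, just through the other path.
That is the sense in which the capability can be advertised to L1 without
checking the host. A field that was copied straight into the hardware
VMCS instead (as the thread notes for GUEST_IA32_PAT) would not get this
indirection.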