RE: [PATCH v3 01/13] nEPT: Support LOAD_IA32_EFER entry/exit controls for L1

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Gleb Natapov wrote on 2013-07-02:
> On Tue, Jul 02, 2013 at 05:34:56PM +0200, Jan Kiszka wrote:
>> On 2013-07-02 17:15, Gleb Natapov wrote:
>>> On Tue, Jul 02, 2013 at 04:28:56PM +0200, Jan Kiszka wrote:
>>>> On 2013-07-02 15:59, Gleb Natapov wrote:
>>>>> On Tue, Jul 02, 2013 at 03:01:24AM +0000, Zhang, Yang Z wrote:
>>>>>> Since this series is pending in mail list for long time. And
>>>>>> it's really a big feature for Nested. Also, I doubt the
>>>>>> original authors(Jun and Nahav)should not have enough time to continue it.
>>>>>> So I will pick it up. :)
>>>>>> 
>>>>>> See comments below:
>>>>>> 
>>>>>> Paolo Bonzini wrote on 2013-05-20:
>>>>>>> Il 19/05/2013 06:52, Jun Nakajima ha scritto:
>>>>>>>> From: Nadav Har'El <nyh@xxxxxxxxxx>
>>>>>>>> 
>>>>>>>> Recent KVM, since
>>>>>>>> http://kerneltrap.org/mailarchive/linux-kvm/2010/5/2/6261577
>>>>>>>> switch the EFER MSR when EPT is used and the host and guest have
>>>>>>>> different NX bits. So if we add support for nested EPT (L1 guest
>>>>>>>> using EPT to run L2) and want to be able to run recent KVM as L1,
>>>>>>>> we need to allow L1 to use this EFER switching feature.
>>>>>>>> 
>>>>>>>> To do this EFER switching, KVM uses VM_ENTRY/EXIT_LOAD_IA32_EFER
>>>>>>>> if available, and if it isn't, it uses the generic
>>>>>>>> VM_ENTRY/EXIT_MSR_LOAD. This patch adds support for the former
>>>>>>>> (the latter is still unsupported).
>>>>>>>> 
>>>>>>>> Nested entry and exit emulation (prepare_vmcs_02 and
>>>>>>>> load_vmcs12_host_state, respectively) already handled
>>>>>>>> VM_ENTRY/EXIT_LOAD_IA32_EFER correctly. So all that's left to do
>>>>>>>> in this patch is to properly advertise this feature to L1.
>>>>>>>> 
>>>>>>>> Note that vmcs12's VM_ENTRY/EXIT_LOAD_IA32_EFER are emulated by
>>>>>>>> L0, by using vmx_set_efer (which itself sets one of several
>>>>>>>> vmcs02 fields), so we always support this feature, regardless of
>>>>>>>> whether the host supports it.
>>>>>>>> 
>>>>>>>> Signed-off-by: Nadav Har'El <nyh@xxxxxxxxxx>
>>>>>>>> Signed-off-by: Jun Nakajima <jun.nakajima@xxxxxxxxx>
>>>>>>>> Signed-off-by: Xinhao Xu <xinhao.xu@xxxxxxxxx>
>>>>>>>> ---
>>>>>>>>  arch/x86/kvm/vmx.c | 23 ++++++++++++++++-------
>>>>>>>>  1 file changed, 16 insertions(+), 7 deletions(-)
>>>>>>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index
>>>>>>>> 260a919..fb9cae5 100644
>>>>>>>> --- a/arch/x86/kvm/vmx.c
>>>>>>>> +++ b/arch/x86/kvm/vmx.c
>>>>>>>> @@ -2192,7 +2192,8 @@ static __init void
>>>>>>>> nested_vmx_setup_ctls_msrs(void)  #else
>>>>>>>>  	nested_vmx_exit_ctls_high = 0;  #endif
>>>>>>>> -	nested_vmx_exit_ctls_high |= VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR;
>>>>>>>> +	nested_vmx_exit_ctls_high |= (VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR
>>>>>>>> | +				      VM_EXIT_LOAD_IA32_EFER);
>>>>>>>> 
>>>>>>>>  	/* entry controls */
>>>>>>>>  	rdmsr(MSR_IA32_VMX_ENTRY_CTLS, @@ -2201,8 +2202,8
> @@ static
>>>>>>>> __init void nested_vmx_setup_ctls_msrs(void)
>>>>>>>>  	nested_vmx_entry_ctls_low = VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR;
>>>>>>>>  	nested_vmx_entry_ctls_high &= 		VM_ENTRY_LOAD_IA32_PAT |
>>>>>>>>  VM_ENTRY_IA32E_MODE;
>>>>>>>> -	nested_vmx_entry_ctls_high |=
>>>>>>>> VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR; -
>>>>>>>> +	nested_vmx_entry_ctls_high |=
>>>>>>>> (VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR | +				      
>>>>>>>> VM_ENTRY_LOAD_IA32_EFER);
>>>>>>>>  	/* cpu-based controls */
>>>>>>>>  	rdmsr(MSR_IA32_VMX_PROCBASED_CTLS,
>>>>>>>>  		nested_vmx_procbased_ctls_low,
>>>>>>>> nested_vmx_procbased_ctls_high); @@ -7492,10 +7493,18 @@ static
>>>>>>>> void prepare_vmcs02(struct kvm_vcpu *vcpu,
>>>>>>> struct vmcs12 *vmcs12)
>>>>>>>>  	vcpu->arch.cr0_guest_owned_bits &=
>>>>>>>>  ~vmcs12->cr0_guest_host_mask; 	vmcs_writel(CR0_GUEST_HOST_MASK,
>>>>>>> ~vcpu->arch.cr0_guest_owned_bits);
>>>>>>>> 
>>>>>>>> -	/* Note: IA32_MODE, LOAD_IA32_EFER are modified by
> vmx_set_efer
>>>>>>> below */
>>>>>>>> -	vmcs_write32(VM_EXIT_CONTROLS, -		vmcs12->vm_exit_controls |
>>>>>>>> vmcs_config.vmexit_ctrl); -	vmcs_write32(VM_ENTRY_CONTROLS,
>>>>>>>> vmcs12->vm_entry_controls | +	/* L2->L1 exit controls are
>>>>>>>> emulated - the hardware exit is +to L0 so +	 * we should use its
>>>>>>>> exit controls. Note that IA32_MODE, LOAD_IA32_EFER +	 * bits are
>>>>>>>> further modified by vmx_set_efer() below. +	 */
>>>>>>>> +	vmcs_write32(VM_EXIT_CONTROLS, vmcs_config.vmexit_ctrl);
>>>>>> This is wrong. We cannot use L0 exit control directly.
>>>>>> LOAD_PERF_GLOBAL_CTRL, LOAD_HOST_EFE, LOAD_HOST_PAT,
> ACK_INTR_ON_EXIT should use host's exit control. But others, still
> need use (vmcs12|host).
>>>>>> 
>>>>> I do not see why. We always intercept DR7/PAT/EFER, so save is
>>>>> emulated too. Host address space size always come from L0 and
>>>>> preemption timer is not supported for nested IIRC and when it
>>>>> will be host will have to save it on exit anyway for correct emulation.
>>>> 
>>>> Preemption timer is already supported and works fine as far as I tested.
>>>> KVM doesn't use it for L1, so we do not need to save/restore it - IIRC.
>>>> 
>>> So what happens if L1 configures it to value X after X/2 ticks L0
>>> exit happen and L0 gets back to L2 directly. The counter will be X
>>> again instead of X/2.
>> 
>> Likely. Yes, we need to improve our emulation by setting "Save
>> VMX-preemption timer value" or emulate this in software if the
>> hardware lacks support for it (was this flag introduced after the
>> preemption timer itself?).
>> 
> Not sure, but my point was that for correct emulation host needs to
> set "save preempt timer on vmexit" anyway so all VM_EXIT_CONTROLS are
> indeed emulated as far as I see.
>
Ok, here is my summary, please correct me if I am wrong:
bit 2: Save debug controls, the first processor only support 1-setting on it, so just use host's setting is enough
bit 9: Host address space size, it indicate the host's state, so must use host's setting.
bit 12: Load IA32_PERF_GLOBAL_CTRL: same as above.
bit 15 : Acknowledge interrupt on exit: same as above.
bit 19: Load IA32_PAT: same as above.
bit 20: Load IA32_EFER: same as above.
bit 18: Save IA32_PAT, Didn't expose it to L1, so use host' setting is ok.
bit 19: Save IA32_EFER, same as above.
bit 22: Save VMXpreemption timer value, I don't see KVM expose it to L1, but Jan said it's working. Strange! And according gleb's suggestion, it better to always set it.

So, currently, only use host' exit_control for L2 is enough.

Best regards,
Yang


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux