Re: [PATCH] KVM: nVMX: Fully support of nested VMX preemption timer

Jan Kiszka <jan.kiszka@xxxxxx> · Sun, 25 Aug 2013 10:27:22 +0200

On 2013-08-25 10:25, Jan Kiszka wrote:
> On 2013-08-25 10:18, Abel Gordon wrote:
>>
>>
>> kvm-owner@xxxxxxxxxxxxxxx wrote on 25/08/2013 10:54:13 AM:
>>
>>> From: Jan Kiszka <jan.kiszka@xxxxxx>
>>> To: Abel Gordon/Haifa/IBM@IBMIL,
>>> Cc: gleb@xxxxxxxxxx, kvm <kvm@xxxxxxxxxxxxxxx>, pbonzini@xxxxxxxxxx,
>>> "李春奇 <Arthur Chunqi Li>"  <yzt356@xxxxxxxxx>
>>> Date: 25/08/2013 10:54 AM
>>> Subject: Re: [PATCH] KVM: nVMX: Fully support of nested VMX preemption
>> timer
>>> Sent by: kvm-owner@xxxxxxxxxxxxxxx
>>>
>>> On 2013-08-25 09:50, Abel Gordon wrote:
>>>>
>>>>
>>>> kvm-owner@xxxxxxxxxxxxxxx wrote on 25/08/2013 10:43:12 AM:
>>>>
>>>>> From: Jan Kiszka <jan.kiszka@xxxxxx>
>>>>> To: Abel Gordon/Haifa/IBM@IBMIL,
>>>>> Cc: gleb@xxxxxxxxxx, kvm@xxxxxxxxxxxxxxx, kvm-owner@xxxxxxxxxxxxxxx,
>>>>> pbonzini@xxxxxxxxxx, "李春奇 <Arthur Chunqi Li>" <yzt356@xxxxxxxxx>
>>>>> Date: 25/08/2013 10:43 AM
>>>>> Subject: Re: [PATCH] KVM: nVMX: Fully support of nested VMX preemption
>>>> timer
>>>>> Sent by: kvm-owner@xxxxxxxxxxxxxxx
>>>>>
>>>>> On 2013-08-25 09:37, Abel Gordon wrote:
>>>>>>
>>>>>>
>>>>>>> From: Jan Kiszka <jan.kiszka@xxxxxx>
>>>>>>> To: "李春奇 <Arthur Chunqi Li>"  <yzt356@xxxxxxxxx>,
>>>>>>> Cc: kvm@xxxxxxxxxxxxxxx, gleb@xxxxxxxxxx, pbonzini@xxxxxxxxxx
>>>>>>> Date: 25/08/2013 09:44 AM
>>>>>>> Subject: Re: [PATCH] KVM: nVMX: Fully support of nested VMX
>> preemption
>>>>>> timer
>>>>>>> Sent by: kvm-owner@xxxxxxxxxxxxxxx
>>>>>>>
>>>>>>> On 2013-08-24 20:44, root wrote:
>>>>>>>> This patch contains the following two changes:
>>>>>>>> 1. Fix the bug in nested preemption timer support. If vmexit L2->L0
>>>>>>>> with some reasons not emulated by L1, preemption timer value should
>>>>>>>> be save in such exits.
>>>>>>>> 2. Add support of "Save VMX-preemption timer value" VM-Exit
>> controls
>>>>>>>> to nVMX.
>>>>>>>>
>>>>>>>> With this patch, nested VMX preemption timer features are fully
>>>>>>>> supported.
>>>>>>>>
>>>>>>>> Signed-off-by: Arthur Chunqi Li <yzt356@xxxxxxxxx>
>>>>>>>> ---
>>>>>>
>>>>>>>>
>>>>>>>> @@ -7578,9 +7579,14 @@ static void prepare_vmcs02(struct kvm_vcpu
>>>>>>> *vcpu, struct vmcs12 *vmcs12)
>>>>>>>>        (vmcs_config.pin_based_exec_ctrl |
>>>>>>>>         vmcs12->pin_based_vm_exec_control));
>>>>>>>>
>>>>>>>> -   if (vmcs12->pin_based_vm_exec_control &
>>>>>> PIN_BASED_VMX_PREEMPTION_TIMER)
>>>>>>>> -      vmcs_write32(VMX_PREEMPTION_TIMER_VALUE,
>>>>>>>> -              vmcs12->vmx_preemption_timer_value);
>>>>>>>> +   if (vmcs12->pin_based_vm_exec_control &
>>>>>>> PIN_BASED_VMX_PREEMPTION_TIMER) {
>>>>>>>> +      if (vmcs12->vm_exit_controls &
>>>>>> VM_EXIT_SAVE_VMX_PREEMPTION_TIMER)
>>>>>>>> +         vmcs12->vmx_preemption_timer_value =
>>>>>>>> +            vmcs_read32(VMX_PREEMPTION_TIMER_VALUE);
>>>>>>>> +      else
>>>>>>>> +         vmcs_write32(VMX_PREEMPTION_TIMER_VALUE,
>>>>>>>> +               vmcs12->vmx_preemption_timer_value);
>>>>>>>> +   }
>>>>>>>
>>>>>>> This is not correct. We still need to set the vmcs to
>>>>>>> vmx_preemption_timer_value. The difference is that, on exit from L2,
>>>>>>> vmx_preemption_timer_value has to be updated according to the saved
>>>>>>> hardware state. The corresponding code is missing in your patch so
>>>> far.
>>>>>>
>>>>>> I think something else maybe be missing here: assuming L0 handles
>> exits
>>>>>> for L2 without involving L1 (e.g. external interrupts or ept
>>>> violations),
>>>>>> then, we may spend some cycles in L0 handling these exits. Note L1 is
>>>> not
>>>>>> aware of these exits and from L1 perspective L2 was running on the
>> CPU.
>>>>>> That means that we may need to reduce these cycles spent at
>>>>>> L0 from the preemtion timer or emulate a preemption timer exit to
>>>>>> force a transition to L1 instead of resuming L2.
>>>>>
>>>>> That's precisely what the logic I described should achieve: reload the
>>>>> value we saved on L2 exit on reentry.
>>>>
>>>> But don't you think we should also reduce the cycles spent at L0 from
>> the
>>>> preemption timer ? I mean, if we spent X cycles at L0 handling a L2
>> exit
>>>> which was not forwarded to L1, then, before we resume L2,
>>>> the preemption timer should be: (previous_value_on_exit - X).
>>>> If (previous_value_on_exit - X) < 0, then we should force ("emulate") a
>>>> preemption timer exit between L2 and L1.
>>>
>>> We ask the hardware to save the value of the preemption on L2 exit. This
>>> value will be exposed to L1 (if it asked for saving as well) and/or be
>>> written back to the hardware on L2 reenty (unless L1 had a chance to run
>>> and modified it). So the time spent in L0 is implicitly subtracted.
>>
>> I think you are suggesting the following, please correct me if I am wrong.
>> 1) L1 resumes L2 with preemption timer enabled
>> 2) L0 emulates the resume/launch
>> 3) L2 runs for Y cycles until an external interrupt occurs (Y < preemption
>> timer specified by L1)
>> 4) L0 saved the preemption timer (original value - Y)
>> 5) L0 spends X cycles handling the external interrupt
>> 6) L0 resumes L2 with preemption timer = original value - Y
>>
>> Note that in this case "X is ignored".
> 
> Yes, but see my other reply.
> 
>>
>> I was suggesting to do the following:
>> 6) If original value - Y - X > 0 then
>>  L0 resumes L2 with preemption timer = original value - Y - X
>> else
>>  L0 emulates a L2->L1 preemption timer exit (resumes L1)
> 
> Almost . 6) should be:
> If exit to L1 occurred after last L2, set X to 0. Then load MAX(original
> value - Y - X, 0).

Hmm, no:

If exit to L1 occurred after last L2, load value of vmcs12, else load
MAX(original
value - Y - X, 0).

Jan

Attachment:
signature.asc

Description: OpenPGP digital signature