On 2013-08-25 10:04, Abel Gordon wrote: > > > kvm-owner@xxxxxxxxxxxxxxx wrote on 25/08/2013 10:55:24 AM: > >> From: Arthur Chunqi Li <yzt356@xxxxxxxxx> >> To: Abel Gordon/Haifa/IBM@IBMIL, >> Cc: Jan Kiszka <jan.kiszka@xxxxxx>, Gleb Natapov <gleb@xxxxxxxxxx>, >> kvm <kvm@xxxxxxxxxxxxxxx>, kvm-owner@xxxxxxxxxxxxxxx, Paolo Bonzini >> <pbonzini@xxxxxxxxxx> >> Date: 25/08/2013 10:55 AM >> Subject: Re: [PATCH] KVM: nVMX: Fully support of nested VMX preemption > timer >> Sent by: kvm-owner@xxxxxxxxxxxxxxx >> >> On Sun, Aug 25, 2013 at 3:50 PM, Abel Gordon <ABELG@xxxxxxxxxx> wrote: >>> >>> >>> kvm-owner@xxxxxxxxxxxxxxx wrote on 25/08/2013 10:43:12 AM: >>> >>>> From: Jan Kiszka <jan.kiszka@xxxxxx> >>>> To: Abel Gordon/Haifa/IBM@IBMIL, >>>> Cc: gleb@xxxxxxxxxx, kvm@xxxxxxxxxxxxxxx, kvm-owner@xxxxxxxxxxxxxxx, >>>> pbonzini@xxxxxxxxxx, "李春奇 <Arthur Chunqi Li>" <yzt356@xxxxxxxxx> >>>> Date: 25/08/2013 10:43 AM >>>> Subject: Re: [PATCH] KVM: nVMX: Fully support of nested VMX preemption >>> timer >>>> Sent by: kvm-owner@xxxxxxxxxxxxxxx >>>> >>>> On 2013-08-25 09:37, Abel Gordon wrote: >>>>> >>>>> >>>>>> From: Jan Kiszka <jan.kiszka@xxxxxx> >>>>>> To: "李春奇 <Arthur Chunqi Li>" <yzt356@xxxxxxxxx>, >>>>>> Cc: kvm@xxxxxxxxxxxxxxx, gleb@xxxxxxxxxx, pbonzini@xxxxxxxxxx >>>>>> Date: 25/08/2013 09:44 AM >>>>>> Subject: Re: [PATCH] KVM: nVMX: Fully support of nested VMX > preemption >>>>> timer >>>>>> Sent by: kvm-owner@xxxxxxxxxxxxxxx >>>>>> >>>>>> On 2013-08-24 20:44, root wrote: >>>>>>> This patch contains the following two changes: >>>>>>> 1. Fix the bug in nested preemption timer support. If vmexit L2-> > L0 >>>>>>> with some reasons not emulated by L1, preemption timer value > should >>>>>>> be save in such exits. >>>>>>> 2. Add support of "Save VMX-preemption timer value" VM-Exit > controls >>>>>>> to nVMX. >>>>>>> >>>>>>> With this patch, nested VMX preemption timer features are fully >>>>>>> supported. >>>>>>> >>>>>>> Signed-off-by: Arthur Chunqi Li <yzt356@xxxxxxxxx> >>>>>>> --- >>>>> >>>>>>> >>>>>>> @@ -7578,9 +7579,14 @@ static void prepare_vmcs02(struct kvm_vcpu >>>>>> *vcpu, struct vmcs12 *vmcs12) >>>>>>> (vmcs_config.pin_based_exec_ctrl | >>>>>>> vmcs12->pin_based_vm_exec_control)); >>>>>>> >>>>>>> - if (vmcs12->pin_based_vm_exec_control & >>>>> PIN_BASED_VMX_PREEMPTION_TIMER) >>>>>>> - vmcs_write32(VMX_PREEMPTION_TIMER_VALUE, >>>>>>> - vmcs12->vmx_preemption_timer_value); >>>>>>> + if (vmcs12->pin_based_vm_exec_control & >>>>>> PIN_BASED_VMX_PREEMPTION_TIMER) { >>>>>>> + if (vmcs12->vm_exit_controls & >>>>> VM_EXIT_SAVE_VMX_PREEMPTION_TIMER) >>>>>>> + vmcs12->vmx_preemption_timer_value = >>>>>>> + vmcs_read32(VMX_PREEMPTION_TIMER_VALUE); >>>>>>> + else >>>>>>> + vmcs_write32(VMX_PREEMPTION_TIMER_VALUE, >>>>>>> + vmcs12->vmx_preemption_timer_value); >>>>>>> + } >>>>>> >>>>>> This is not correct. We still need to set the vmcs to >>>>>> vmx_preemption_timer_value. The difference is that, on exit from > L2, >>>>>> vmx_preemption_timer_value has to be updated according to the saved >>>>>> hardware state. The corresponding code is missing in your patch so >>> far. >>>>> >>>>> I think something else maybe be missing here: assuming L0 handles > exits >>>>> for L2 without involving L1 (e.g. external interrupts or ept >>> violations), >>>>> then, we may spend some cycles in L0 handling these exits. Note L1 > is >>> not >>>>> aware of these exits and from L1 perspective L2 was running on the > CPU. >>>>> That means that we may need to reduce these cycles spent at >>>>> L0 from the preemtion timer or emulate a preemption timer exit to >>>>> force a transition to L1 instead of resuming L2. >>>> >>>> That's precisely what the logic I described should achieve: reload the >>>> value we saved on L2 exit on reentry. >>> >>> But don't you think we should also reduce the cycles spent at L0 from > the >>> preemption timer ? I mean, if we spent X cycles at L0 handling a L2 > exit >>> which was not forwarded to L1, then, before we resume L2, >>> the preemption timer should be: (previous_value_on_exit - X). >>> If (previous_value_on_exit - X) < 0, then we should force ("emulate") a >>> preemption timer exit between L2 and L1. >> Sorry, I previously misunderstand your comments. But why should we >> need to exclude cycles in L0 from L2 preemption value? These cycles >> are not spent by L2 and it should not be on L2. > > L1 asked the "hardware" (emulated by L0) to run L2 and force an exit > after "Y" cycles. Now, in practice, we may spend "X" cycles at L0 handling > exits without switching to L1. That means that from L1 perspective L2 > was running all these X cycles. L1 should assume that the instructions per > cycle > the CPU executed decreased but the cycles were spent. That's why I believe > you should take in account these X cycles. > Now I get it. There is likely some truth in this as the reference clock for the preemption timer, the TSC, isn't stopped for L1/L2 while running in L0. And the SDM demands the countdown to be proportional to that clock. Jan
Attachment:
signature.asc
Description: OpenPGP digital signature