kvm-owner@xxxxxxxxxxxxxxx wrote on 25/08/2013 10:54:13 AM: > From: Jan Kiszka <jan.kiszka@xxxxxx> > To: Abel Gordon/Haifa/IBM@IBMIL, > Cc: gleb@xxxxxxxxxx, kvm <kvm@xxxxxxxxxxxxxxx>, pbonzini@xxxxxxxxxx, > "李春奇 <Arthur Chunqi Li>" <yzt356@xxxxxxxxx> > Date: 25/08/2013 10:54 AM > Subject: Re: [PATCH] KVM: nVMX: Fully support of nested VMX preemption timer > Sent by: kvm-owner@xxxxxxxxxxxxxxx > > On 2013-08-25 09:50, Abel Gordon wrote: > > > > > > kvm-owner@xxxxxxxxxxxxxxx wrote on 25/08/2013 10:43:12 AM: > > > >> From: Jan Kiszka <jan.kiszka@xxxxxx> > >> To: Abel Gordon/Haifa/IBM@IBMIL, > >> Cc: gleb@xxxxxxxxxx, kvm@xxxxxxxxxxxxxxx, kvm-owner@xxxxxxxxxxxxxxx, > >> pbonzini@xxxxxxxxxx, "李春奇 <Arthur Chunqi Li>" <yzt356@xxxxxxxxx> > >> Date: 25/08/2013 10:43 AM > >> Subject: Re: [PATCH] KVM: nVMX: Fully support of nested VMX preemption > > timer > >> Sent by: kvm-owner@xxxxxxxxxxxxxxx > >> > >> On 2013-08-25 09:37, Abel Gordon wrote: > >>> > >>> > >>>> From: Jan Kiszka <jan.kiszka@xxxxxx> > >>>> To: "李春奇 <Arthur Chunqi Li>" <yzt356@xxxxxxxxx>, > >>>> Cc: kvm@xxxxxxxxxxxxxxx, gleb@xxxxxxxxxx, pbonzini@xxxxxxxxxx > >>>> Date: 25/08/2013 09:44 AM > >>>> Subject: Re: [PATCH] KVM: nVMX: Fully support of nested VMX preemption > >>> timer > >>>> Sent by: kvm-owner@xxxxxxxxxxxxxxx > >>>> > >>>> On 2013-08-24 20:44, root wrote: > >>>>> This patch contains the following two changes: > >>>>> 1. Fix the bug in nested preemption timer support. If vmexit L2->L0 > >>>>> with some reasons not emulated by L1, preemption timer value should > >>>>> be save in such exits. > >>>>> 2. Add support of "Save VMX-preemption timer value" VM-Exit controls > >>>>> to nVMX. > >>>>> > >>>>> With this patch, nested VMX preemption timer features are fully > >>>>> supported. > >>>>> > >>>>> Signed-off-by: Arthur Chunqi Li <yzt356@xxxxxxxxx> > >>>>> --- > >>> > >>>>> > >>>>> @@ -7578,9 +7579,14 @@ static void prepare_vmcs02(struct kvm_vcpu > >>>> *vcpu, struct vmcs12 *vmcs12) > >>>>> (vmcs_config.pin_based_exec_ctrl | > >>>>> vmcs12->pin_based_vm_exec_control)); > >>>>> > >>>>> - if (vmcs12->pin_based_vm_exec_control & > >>> PIN_BASED_VMX_PREEMPTION_TIMER) > >>>>> - vmcs_write32(VMX_PREEMPTION_TIMER_VALUE, > >>>>> - vmcs12->vmx_preemption_timer_value); > >>>>> + if (vmcs12->pin_based_vm_exec_control & > >>>> PIN_BASED_VMX_PREEMPTION_TIMER) { > >>>>> + if (vmcs12->vm_exit_controls & > >>> VM_EXIT_SAVE_VMX_PREEMPTION_TIMER) > >>>>> + vmcs12->vmx_preemption_timer_value = > >>>>> + vmcs_read32(VMX_PREEMPTION_TIMER_VALUE); > >>>>> + else > >>>>> + vmcs_write32(VMX_PREEMPTION_TIMER_VALUE, > >>>>> + vmcs12->vmx_preemption_timer_value); > >>>>> + } > >>>> > >>>> This is not correct. We still need to set the vmcs to > >>>> vmx_preemption_timer_value. The difference is that, on exit from L2, > >>>> vmx_preemption_timer_value has to be updated according to the saved > >>>> hardware state. The corresponding code is missing in your patch so > > far. > >>> > >>> I think something else maybe be missing here: assuming L0 handles exits > >>> for L2 without involving L1 (e.g. external interrupts or ept > > violations), > >>> then, we may spend some cycles in L0 handling these exits. Note L1 is > > not > >>> aware of these exits and from L1 perspective L2 was running on the CPU. > >>> That means that we may need to reduce these cycles spent at > >>> L0 from the preemtion timer or emulate a preemption timer exit to > >>> force a transition to L1 instead of resuming L2. > >> > >> That's precisely what the logic I described should achieve: reload the > >> value we saved on L2 exit on reentry. > > > > But don't you think we should also reduce the cycles spent at L0 from the > > preemption timer ? I mean, if we spent X cycles at L0 handling a L2 exit > > which was not forwarded to L1, then, before we resume L2, > > the preemption timer should be: (previous_value_on_exit - X). > > If (previous_value_on_exit - X) < 0, then we should force ("emulate") a > > preemption timer exit between L2 and L1. > > We ask the hardware to save the value of the preemption on L2 exit. This > value will be exposed to L1 (if it asked for saving as well) and/or be > written back to the hardware on L2 reenty (unless L1 had a chance to run > and modified it). So the time spent in L0 is implicitly subtracted. I think you are suggesting the following, please correct me if I am wrong. 1) L1 resumes L2 with preemption timer enabled 2) L0 emulates the resume/launch 3) L2 runs for Y cycles until an external interrupt occurs (Y < preemption timer specified by L1) 4) L0 saved the preemption timer (original value - Y) 5) L0 spends X cycles handling the external interrupt 6) L0 resumes L2 with preemption timer = original value - Y Note that in this case "X is ignored". I was suggesting to do the following: 6) If original value - Y - X > 0 then L0 resumes L2 with preemption timer = original value - Y - X else L0 emulates a L2->L1 preemption timer exit (resumes L1) -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html