On Wed, Oct 28, 2009 at 06:23:42PM +0200, Orit Wasserman wrote:
> > > Gleb Natapov <gleb@xxxxxxxxxx> wrote on 25/10/2009 11:44:31:
> > > >
> > > > Subject: Re: [PATCH 5/5] Nested VMX patch 5 implements vmlaunch and vmresume
> > > >
> > > > On Thu, Oct 22, 2009 at 05:46:16PM +0200, Orit Wasserman wrote:
> > > > > Gleb Natapov <gleb@xxxxxxxxxx> wrote on 22/10/2009 11:04:58:
> > > > > >
> > > > > > On Wed, Oct 21, 2009 at 04:43:44PM +0200, Orit Wasserman wrote:
> > > > > > > @@ -4641,10 +4955,13 @@ static void vmx_complete_interrupts(struct vcpu_vmx *vmx)
> > > > > > >  	int type;
> > > > > > >  	bool idtv_info_valid;
> > > > > > >
> > > > > > > -	exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
> > > > > > > -
> > > > > > >  	vmx->exit_reason = vmcs_read32(VM_EXIT_REASON);
> > > > > > >
> > > > > > > +	if (vmx->nested.nested_mode)
> > > > > > > +		return;
> > > > > > > +
> > > > > > Why return here? What does the function do that should not be done
> > > > > > in nested mode?
> > > > > In nested mode L0 injects an interrupt into L2 only in one scenario:
> > > > > if there is an IDT_VALID event and L0 decides to run L2 again and
> > > > > not to switch back to L1. In all other cases the injection is
> > > > > handled by L1.
> > > > This is exactly the kind of scenario that is handled by
> > > > vmx_complete_interrupts(). (vmx|svm)_complete_interrupts() store the
> > > > pending event in an arch-agnostic way, and re-injection is handled by
> > > > x86.c. You bypass this logic by inserting the return here and by
> > > > introducing the nested_handle_valid_idt() function below.
> > > The only location where we can truly know whether we are switching to
> > > L1 is vmx_vcpu_run, because enable_irq_window (which is called after
> > > handling the exit) can decide to switch to L1 because of an interrupt.
> > enable_irq_window() will be called after the L2 VMCS has been set up for
> > event re-injection by the previous call to inject_pending_event(). As
> > far as I can see this should work for interrupt injection. For
> > exceptions we should probably require the L2 guest to re-execute the
> > faulting instruction for now, like svm does.
> The main issue is that L0 doesn't inject events into L2; the L1 hypervisor
> does (we want to keep the nested hypervisor semantics as much as
> possible). Only if the event was caused by the fact that L2 is a nested
> guest, and L1 can't handle it, will L0 re-inject an event into L2, for
> example an IDT event with a page fault caused by a missing entry in SPT02
> (the shadow page table L0 creates for L2). In this case, when
> vmx_complete_interrupts is called, L0 doesn't know whether the page fault
> should be handled by it or by L1 (that is decided later, when handling
> the exit).
So what? When it is decided that an exit to L1 is needed, the pending event
will be transferred into L2's idt_vectoring_info. Otherwise the event will
be reinjected by the usual mechanism. BTW, I don't see where your current
code sets up L2's idt_vectoring_info when it is decided that L1 should
handle the event reinjection.
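For illustration, a minimal sketch of that handover. This is not the actual
patch code: vmcs12_sketch is a hypothetical stand-in for whatever structure
the nested-VMX patches use to hold the L1-visible VMCS image of L2, and
transfer_pending_event_to_l1() is an invented name; only the VMCS field
constants and vmcs_read32() are real vmx.c/vmx.h names.

/*
 * Hypothetical stand-in for the L1-visible VMCS image of L2.
 */
struct vmcs12_sketch {
	u32 idt_vectoring_info;
	u32 idt_vectoring_error_code;
	/* ... the rest of the L1-visible VMCS fields for L2 ... */
};

/*
 * Called once L0 has decided to reflect the exit from L2 to L1:
 * instead of requeueing the interrupted event for direct reinjection
 * into L2, expose it to L1 through the L2 VMCS image so that the L1
 * hypervisor decides whether and how to reinject it.
 */
static void transfer_pending_event_to_l1(struct vmcs12_sketch *vmcs12)
{
	u32 idt_info = vmcs_read32(IDT_VECTORING_INFO_FIELD);

	vmcs12->idt_vectoring_info = idt_info;
	if ((idt_info & VECTORING_INFO_VALID_MASK) &&
	    (idt_info & VECTORING_INFO_DELIVER_CODE_MASK))
		vmcs12->idt_vectoring_error_code =
			vmcs_read32(IDT_VECTORING_ERROR_CODE);
}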
> In most other cases, L0 will switch to L1, and L1 will decide whether
> there will be re-injection (depending on the L1 hypervisor's logic) and
> update the L2 VMCS accordingly.
>
> > > In order to simplify our code, it was simpler to bypass
> > > vmx_complete_interrupts when it is called (after running L2) and to
> > > add nested_handle_valid_idt just before running L2.
> > > > > > > +	exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
> > > > > > > +
> > > > > > >  	/* Handle machine checks before interrupts are enabled */
> > > > > > >  	if ((vmx->exit_reason == EXIT_REASON_MCE_DURING_VMENTRY)
> > > > > > >  	    || (vmx->exit_reason == EXIT_REASON_EXCEPTION_NMI
> > > > > > > @@ -4747,6 +5064,60 @@ static void fixup_rmode_irq(struct vcpu_vmx *vmx)
> > > > > > >  		| vmx->rmode.irq.vector;
> > > > > > >  }
> > > > > > >
> > > > > > > +static int nested_handle_valid_idt(struct kvm_vcpu *vcpu)
> > > > > > > +{
> > > > > > It seems that with this function you are trying to bypass the
> > > > > > general event re-injection logic. Why?
> > > > > See above.
> > > > The logic implemented by this function is handled in x86.c in an
> > > > arch-agnostic way. Is there something wrong with it?
> > > See my comment above.
> > Sometimes it is wrong to reinject events from L0 into L2 directly. If
> > L2 was not able to handle an event because its IDT is not mapped by the
> > L1 shadow page table, we should generate a PF vmexit with valid idt
> > vectoring info to L1 and let L1 handle the event reinjection.
> >
> > --
> > 	Gleb.

--
	Gleb.
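To make the reinjection decision discussed above concrete, here is a rough
sketch continuing the vmcs12_sketch example from earlier in this mail.
Everything here is illustrative, not the patch's actual code:
l0_handles_this_exit() is a hypothetical predicate, and a real
implementation would also handle the error code and NMI unblocking rather
than just replaying the raw vectoring info.

/*
 * On an exit from L2 with a valid IDT_VECTORING_INFO_FIELD, L0 either
 * fixes the problem itself and replays the event into L2, or reflects
 * the exit (and the vectoring info) to L1 and lets L1 decide about
 * reinjection.
 */
static int sketch_complete_l2_event(struct kvm_vcpu *vcpu,
				    struct vmcs12_sketch *vmcs12)
{
	u32 idt_info = vmcs_read32(IDT_VECTORING_INFO_FIELD);

	/* No event delivery was interrupted by this exit. */
	if (!(idt_info & VECTORING_INFO_VALID_MASK))
		return 0;

	if (l0_handles_this_exit(vcpu)) {
		/*
		 * E.g. a #PF on a missing SPT02 entry: L0 repairs its
		 * shadow page table and resumes L2, so the interrupted
		 * event is replayed into L2 directly (simplified; the
		 * usual non-nested path does this via
		 * vmx_complete_interrupts() and inject_pending_event()).
		 */
		vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, idt_info);
		return 0;	/* resume L2 */
	}

	/*
	 * Otherwise reflect the exit to L1: hand the vectoring info to
	 * L1 through L2's VMCS image and do not requeue the event,
	 * since reinjection is now L1's decision.
	 */
	transfer_pending_event_to_l1(vmcs12);
	return 1;	/* switch to L1 */
}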