Re: [PATCH 5/5] Nested VMX patch 5 implements vmlaunch and vmresume

Orit Wasserman <oritw@xxxxxxxxxx> · Thu, 22 Oct 2009 17:46:16 +0200

Gleb Natapov <gleb@xxxxxxxxxx> wrote on 22/10/2009 11:04:58:

> From: Gleb Natapov <gleb@xxxxxxxxxx>
> To: Orit Wasserman/Haifa/IBM@IBMIL
> Cc: Abel Gordon/Haifa/IBM@IBMIL, aliguori@xxxxxxxxxx,
>     Ben-Ami Yassour1/Haifa/IBM@IBMIL, kvm@xxxxxxxxxxxxxxx,
>     mdday@xxxxxxxxxx, Muli Ben-Yehuda/Haifa/IBM@IBMIL
> Date: 22/10/2009 11:05
> Subject: Re: [PATCH 5/5] Nested VMX patch 5 implements vmlaunch and vmresume
>
> On Wed, Oct 21, 2009 at 04:43:44PM +0200, Orit Wasserman wrote:
> > > > @@ -4641,10 +4955,13 @@ static void vmx_complete_interrupts(struct vcpu_vmx *vmx)
> > > >     int type;
> > > >     bool idtv_info_valid;
> > > >
> > > > -   exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
> > > > -
> > > >     vmx->exit_reason = vmcs_read32(VM_EXIT_REASON);
> > > >
> > > > +   if (vmx->nested.nested_mode)
> > > > +      return;
> > > > +
> > > Why return here? What does the function do that should not be done in
> > > nested mode?
> > In nested mode L0 injects an interrupt into L2 in only one scenario:
> > when there is an IDT_VALID event and L0 decides to run L2 again rather
> > than switch back to L1.
> > In all other cases the injection is handled by L1.
> This is exactly the kind of scenario that is handled by
> vmx_complete_interrupts(). (vmx|svm)_complete_interrupts() store the
> pending event in an arch-agnostic way, and re-injection is handled by
> x86.c. You bypass this logic by inserting a return here and introducing
> the nested_handle_valid_idt() function below.
The only place where we can truly know whether we are switching to L1 is
vmx_vcpu_run, because enable_irq_window (which is called after handling
the exit) can decide to switch to L1 because of an interrupt.
To keep our code simple, we bypass vmx_complete_interrupts when it is
called (after running L2) and call nested_handle_valid_idt just before
running L2.
> > >
> > > > +   exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
> > > > +
> > > >     /* Handle machine checks before interrupts are enabled */
> > > >     if ((vmx->exit_reason == EXIT_REASON_MCE_DURING_VMENTRY)
> > > >         || (vmx->exit_reason == EXIT_REASON_EXCEPTION_NMI
> > > > @@ -4747,6 +5064,60 @@ static void fixup_rmode_irq(struct vcpu_vmx *vmx)
> > > >        | vmx->rmode.irq.vector;
> > > >  }
> > > >
> > > > +static int nested_handle_valid_idt(struct kvm_vcpu *vcpu)
> > > > +{
> > > It seems that with this function you are trying to bypass the general
> > > event re-injection logic. Why?
> > See above.
> The logic implemented by this function is handled in x86.c in arch
> agnostic way. Is there something wrong with this?
See my comment above.
>
> > > > +   vmx->launched = vmx->nested.l2_state->launched;
> > > > +
> > > Can you explain why ->launched logic is needed?
> > It is possible that L1 called vmlaunch but we didn't actually run L2
> > (for example, there was an interrupt and enable_irq_window switched
> > back to L1 before running L2). L1 thinks the vmlaunch was successful
> > and calls vmresume the next time, but KVM needs to call vmlaunch for L2.
> handle_vmlaunch() and handle_vmresume() are exactly the same. Why does
> KVM need to run one and not the other?
Yes, they are very similar (almost the same code); the only difference is
the vmclear check. We need to emulate the VMX hardware behavior for those
two instructions and check the VMCS12 state.
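A small standalone model may help show why the two pieces of state are
kept apart. This is only a sketch under assumptions: struct l2_model and
the model_* helpers are hypothetical names, not functions from the patch.
The hardware rule it relies on is that VMLAUNCH requires a VMCS whose
launch state is clear and VMRESUME requires one that is launched.

```c
#include <stdbool.h>

enum entry_insn { INSN_VMLAUNCH, INSN_VMRESUME };

/* Hypothetical model: launch state as L1 sees it vs. what L0 really did. */
struct l2_model {
    bool l1_view_launched; /* launch state L1 believes its VMCS has */
    bool hw_launched;      /* true once L0 has actually entered L2 */
};

/*
 * Emulates the check L1 expects from hardware: VMLAUNCH needs a clear
 * VMCS, VMRESUME needs a launched one. Returns false for VMfail.
 */
static bool model_check_entry(struct l2_model *m, enum entry_insn insn)
{
    bool need_launched = (insn == INSN_VMRESUME);

    if (m->l1_view_launched != need_launched)
        return false;
    m->l1_view_launched = true; /* from L1's view the entry succeeded */
    return true;
}

/* The instruction L0 itself must use depends only on hw_launched. */
static enum entry_insn model_pick_hw_insn(const struct l2_model *m)
{
    return m->hw_launched ? INSN_VMRESUME : INSN_VMLAUNCH;
}
```

If L1's vmlaunch is accepted but L0 switches back to L1 before entering
L2, hw_launched stays false, so L1's following vmresume passes the check
while L0 still has to use vmlaunch for the real entry.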
>
> > > > +static int nested_vmx_vmexit(struct kvm_vcpu *vcpu,
> > > > +              bool is_interrupt)
> > > > +{
> > > > +   struct vcpu_vmx *vmx = to_vmx(vcpu);
> > > > +   int initial_pfu_active = vcpu->fpu_active;
> > > > +
> > > > +   if (!vmx->nested.nested_mode) {
> > > > +      printk(KERN_INFO "WARNING: %s called but not in nested mode\n",
> > > > +             __func__);
> > > > +      return 0;
> > > > +   }
> > > > +
> > > > +   save_msrs(vmx->guest_msrs, vmx->save_nmsrs);
> > > > +
> > > > +   sync_cached_regs_to_vmcs(vcpu);
> > > > +
> > > > +   if (!nested_map_shadow_vmcs(vcpu)) {
> > > > +      printk(KERN_INFO "Error mapping shadow vmcs\n");
> > > > +      set_rflags_to_vmx_fail_valid(vcpu);
> > > Error during vmexit should set abort flag, not change flags.
> > I think this is more of a vmlaunch/vmresume error (in the code that
> > switches back to L1).
> How is this a vmlaunch/vmresume error? This function is called to exit
> from the L2 guest while on the L2 vmcs. It is called asynchronously with
> respect to the L2 guest, and here you modify the L2 guest's rflags
> register at an unpredictable place.
OK.
>
> --
>          Gleb.
