Re: [PATCH 3/6] Nested VMX patch 3 implements vmptrld and vmptrst

Orit Wasserman <oritw@xxxxxxxxxx> · Sun, 6 Sep 2009 16:36:26 +0300

Avi Kivity <avi@xxxxxxxxxx> wrote on 06/09/2009 12:25:17:

> From:
>
> Avi Kivity <avi@xxxxxxxxxx>
>
> To:
>
> Orit Wasserman/Haifa/IBM@IBMIL
>
> Cc:
>
> Abel Gordon/Haifa/IBM@IBMIL, aliguori@xxxxxxxxxx, Ben-Ami Yassour1/
> Haifa/IBM@IBMIL, kvm@xxxxxxxxxxxxxxx, mmday@xxxxxxxxxx, Muli Ben-
> Yehuda/Haifa/IBM@IBMIL
>
> Date:
>
> 06/09/2009 12:25
>
> Subject:
>
> Re: [PATCH 3/6] Nested VMX patch 3 implements vmptrld and vmptrst
>
> On 09/03/2009 05:25 PM, Orit Wasserman wrote:
> >>
> >>> +   /*
> >>> +    * Level 2 state : includes vmcs,registers and
> >>> +    * a copy of vmcs12 for vmread/vmwrite
> >>> +    */
> >>> +   struct level_state *l2_state;
> >>> +
> >>> +   /* Level 1 state for switching to level 2 and back */
> >>> +   struct level_state *l1_state;
> >>>
> >>>
> >> Can you explain why we need two of them?  in the guest vmcs we have
host
> >> and guest values, and in l1_state and l2_state we have more copies,
and
> >> in struct vcpu we have yet another set of copies.  We also have a
couple
> >> of copies in the host vmcs.  I'm getting dizzy...
> >>
> > L2_state stores all the L2 guest state:
> >        vmcs - A pointer to VMCS02, the VMCS used to run it by L0.
> >        shadow vmcs - a structure storing the values of VMCS12 (the vmcs
L1
> > create to run L2).
> >
>
> When we support multiple nested guests, we'll run into a problem of
> where to store shadow_vmcs.  I see these options:
>
> - maintain a cache of limited size of shadow_vmcs; when evicting, copy
> the shadow_vmcs into the guest's vmptr]
> - always put shadow_vmcs in the guest's vmptr, and write protect it so
> the guest can't play with it
> - always put shadow_vmcs in the guest's vmptr, and verify everything you
> read (that's what nsvm does)
>
The second option looks a bit complicated I prefer one of the other two.
> >        cpu - the cpu id
> >
>
> Why is it needed?
This is a copy of the cpu id from the vcpu to store the last cpu id the L2
guest run on.
>
> >        launched- launched flag
> >
>
> Can be part of shadow_vmcs
I prefer to keep the shadow_vmcs as a separate structure to store only VMCS
fields.
>
> >        vpid - the vpid allocate by L0 for L2 (we need to store it
somewhere)
> >
>
> Note the guest can DoS the host by allocating a lot of vpids.  So we to
> allocate host vpids on demand and be able to flush them out.
The guest is not allocating the vpids the host(L0) does using
allocate_vpid.
I agree that with nested the danger for them to run out is bigger.
>
> >        msr_bitmap - At the moment we use L0 msr_bitmap(as we are
running kvm
> > on kvm) in the future we will use a merge of both bitmaps.
> >
>
> Note kvm uses two bitmaps (for long mode and legacy mode).
OK.
>
> > L1 state stores the L1 state -
> >        vmcs - pointer to VMCS01
> >
>
> So it's the same as vmx->vmcs in normal operation?
yes , but with nested the vmx->vmcs is changed when running an L2 (nested)
guest.
>
> >        shadow vmcs - a structure storing the values of VMCS01. we use
it
> > when updating VMCS02 in order to avoid the need to switch between
VMCS02
> > and VMCS01.
> >
>
> Sorry, don't understand.
VMCS02 - the VMCS L0 uses to run L2.
When we create/update VMCS02 we need to read fields from VMCS01 (host state
is taken fully, control fields ).
For L1 the shadow_vmcs is a copy of VMCS01 in a structure format, we used
the same structure.
>
> >        cpu - the cpu id
> >        launched- launched flag
> >
>
> This is a copy of vmx->launched?
Exactly .The vmx->launched is updated when switching from L1/L2 and back so
we need to store it here.
>
> >>> +
> >>> +   if (vmx->nested.l1_cur_vmcs != guest_vmcs_addr) {
> >>> +      vmcs_page = nested_get_page(vcpu, guest_vmcs_addr);
> >>> +      if (vmcs_page == NULL)
> >>> +         return 1;
> >>> +
> >>> +      /* load nested vmcs to processor */
> >>> +      if (vmptrld(vcpu, page_to_phys(vmcs_page))) {
> >>>
> >>>
> >> So, you're loading a guest page as the vmcs.  This is dangerous as the
> >> guest can play with it.  Much better to use inaccessible memory (and
you
> >> do alloc_vmcs() earlier?)
> >>
> > We can copy the vmcs and than vmptrld it. As for the allocate vmcs this
is
> > a memory leak and I will fix it (it should be allocated only once).
> >
>
> But why do it?  Your approach is to store the guest vmcs in the same
> format as the processor (which we don't really know), so you have to use
> vmread/vmwrite to maintain it.  Instead, you can choose that the guest
> vmcs is a shadow_vmcs structure and then you can access it using normal
> memory operations.
I got it now.
We will need a way to distinguish between processor format VMCS and
structure based VMCS,
we can use the revision id field (create a unique revision id for nested
like 0xffff or 0x0).
>
> >> I see.  You're using the processor's format when reading the guest
> >> vmcs.  But we don't have to do that, we can use the shadow_vmcs
> >> structure (and a memcpy).
> >>
> > I'm sorry I don't understand your comment can u elaborate ?
> >
> >
>
> See previous comment.  Basically you can do
>
>    struct shadow_vmcs *svmcs = kmap_atomic(gpa_to_page(vmx->vmptr));
>    printk("guest_cs = %x\n", svmcs->guest_cs_selector);
See above.
>
> instead of
>
>    vmptrld(gpa_to_hpa(vmx->vmptr))
>    printk("guest_cs = %x\n", vmcs_read16(GUEST_CS_SELECTOR));
>
>
> --
> error compiling committee.c: too many arguments to function
>

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html