> From: Nadav Har'El
> Sent: Wednesday, May 25, 2011 7:55 PM
>
> On Wed, May 25, 2011, Tian, Kevin wrote about "RE: [PATCH 31/31] nVMX: Documentation":
> > > +On Intel processors, KVM uses Intel's VMX (Virtual-Machine eXtensions)
> > > +to easily and efficiently run guest operating systems. Normally, these guests
> > > +*cannot* themselves be hypervisors running their own guests, because in VMX,
> > > +guests cannot use VMX instructions.
> >
> > "because in VMX, guests cannot use VMX instructions" doesn't look correct, or else
> > you couldn't add nVMX support. :-) It's just that currently KVM doesn't emulate
> > those VMX instructions.
>
> It depends on whether you look at the half-empty or half-full part of the
> glass ;-)
>
> The VMX instructions, when used in L1, do trap - as mandated by Popek and
> Goldberg's theorem (that sensitive instructions must trap) - but they
> don't "just work" the way, for example, arithmetic instructions just work -
> they need to be emulated by the VMM.
>
> > > +Terminology
> > > +-----------
> > > +
> > > +Single-level virtualization has two levels - the host (KVM) and the guests.
> > > +In nested virtualization, we have three levels: The host (KVM), which we call
> > > +L0, the guest hypervisor, which we call L1, and its nested guest, which we
> > > +call L2.
> >
> > Adding a brief introduction to vmcs01/vmcs02/vmcs12 would also be helpful here,
> > given that this doc is a centralized place to get a quick picture of nested VMX.
>
> I'm adding a short mention now. However, I think this file should be viewed
> as a user's guide, not a developer's guide. Developers should probably read
> our full paper, where this terminology is explained, as well as how vmcs02
> is related to the two others.

I agree with the purpose of this doc.
>
> > > +Additional patches for running Windows under guest KVM, and Linux under
> > > +guest VMware server, and support for nested EPT, are currently running in
> > > +the lab, and will be sent as follow-on patchsets.
> >
> > Any plan for nested VT-d?
>
> Yes, for some definition of Yes ;-)
>
> We do have an experimental nested IOMMU implementation: In our nested VMX
> paper we showed how giving L1 an IOMMU allows for efficient nested device
> assignment (L0 assigns a PCI device to L1, and L1 does the same to L2).
> In that work we used a very simplistic "paravirtual" IOMMU instead of fully
> emulating an IOMMU for L1.
> Later, we did develop a full emulation of an IOMMU for L1, although we didn't
> test it in the context of nested VMX (we used it to allow L1 to use an IOMMU
> for better DMA protection inside the guest).
>
> The IOMMU emulation work was done by Nadav Amit, Muli Ben-Yehuda, et al.,
> and will be described in the upcoming Usenix ATC conference
> (http://www.usenix.org/event/atc11/tech/techAbstracts.html#Amit).
> After the conference in June, the paper will be available at this URL:
> http://www.usenix.org/event/atc11/tech/final_files/Amit.pdf
>
> If there is interest, they can perhaps contribute their work to
> KVM (and QEMU) - if you're interested, please get in touch with them directly.

Thanks, good to know that information.

> > It'd be good to provide a list of known supported features. In your current code,
> > people have to look at the code to understand the current status. If you could keep a
> > supported and verified feature list here, it'd be great.
>
> It would be even better to support all features ;-)
>
> But seriously, the VMX spec is hundreds of pages long, with hundreds of
> features, sub-features, and sub-sub-features, and myriad subcase-of-subfeature
> combinations thereof, so I don't think such a list would be
> practical - or ever be accurate.
No need for all subfeatures; a list of maybe a dozen features, which people enabled
one-by-one, would be very welcome, especially for things which may accelerate L2
performance, such as virtual NMI, TPR shadow, virtual x2APIC, ...

> In the "Known Limitations" section of this document, I'd like to list major
> features which are missing, and perhaps more importantly - L1 and L2
> guests which are known NOT to work.

Yes, that info is also important, so that people can easily reproduce your success.

> By the way, it appears that you've been going over the patches in increasing
> numerical order, and this is the last patch ;-) Have you finished your
> review iteration?

Yes, I've finished my review of all of your v10 patches. :-)

Thanks
Kevin