On 04/18/2013 02:34 PM, Abel Gordon wrote:
> This series of patches implements shadow-vmcs capability for nested VMX.
>
> Shadow-vmcs - background and overview:
>
> In Intel VMX, vmread and vmwrite privileged instructions are used by the
> hypervisor to read and modify the guest and host specifications (VMCS). In a
> nested virtualization environment, L1 executes multiple vmread and vmwrite
> instructions to handle a single L2 exit. Each vmread and vmwrite executed by
> L1 traps (causes an exit) to the L0 hypervisor (KVM). L0 emulates the
> instruction behaviour and resumes L1 execution.
>
> Removing the need to trap and emulate these special instructions reduces the
> number of exits and improves nested virtualization performance. As it was first
> evaluated in [1], exit-less vmread and vmwrite can reduce nested virtualization
> overhead by up to 40%.
>
> Intel introduced a new feature to their processors called shadow-vmcs. Using
> shadow-vmcs, L0 can configure the processor to let L1 running in guest-mode
> access VMCS12 fields using vmread and vmwrite instructions but without causing
> an exit to L0. The VMCS12 fields' data is stored in a shadow-vmcs controlled
> by L0.
>
> Shadow-vmcs - design considerations:
>
> A shadow-vmcs is processor-dependent and must be accessed by L0 or L1 using
> vmread and vmwrite instructions. With nested virtualization we aim to abstract
> the hardware from the L1 hypervisor. Thus, to avoid hardware dependencies we
> preferred to keep the software-defined VMCS12 format as part of L1 address
> space and hold the processor-specific shadow-vmcs format only in L0 address
> space. In other words, the shadow-vmcs is used by L0 as an accelerator but the
> format and content are never exposed to L1 directly. L0 syncs the content of
> the processor-specific shadow vmcs with the content of the software-controlled
> VMCS12 format.
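
A minimal sketch of the two-way sync described in the design considerations, written as plain user-space C rather than kernel code. Everything here is an assumption made for illustration: the three-field subset, the `enum field` encodings, the `struct shadow_vmcs` stand-in, and the `shadow_read`/`shadow_write` helpers that play the role of vmread/vmwrite. The only point it tries to show is that L0 moves data between a software-defined layout and an opaque processor-specific one field by field, so the processor format never leaks into L1 memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical field identifiers; real VMCS field encodings differ. */
enum field { FIELD_GUEST_RIP, FIELD_GUEST_RSP, FIELD_EXIT_REASON, FIELD_COUNT };

/* Software-defined VMCS12 layout kept in L1 address space (tiny subset). */
struct vmcs12 {
    uint64_t guest_rip;
    uint64_t guest_rsp;
    uint64_t exit_reason;
};

/* Stand-in for the processor-specific shadow VMCS held by L0.
 * Treated as opaque: only the accessors below touch it. */
struct shadow_vmcs {
    uint64_t slot[FIELD_COUNT];
};

/* Stand-ins for the vmread and vmwrite instructions. */
static uint64_t shadow_read(const struct shadow_vmcs *s, enum field f)
{
    return s->slot[f];
}

static void shadow_write(struct shadow_vmcs *s, enum field f, uint64_t v)
{
    s->slot[f] = v;
}

/* Map a field identifier to its location in the software format. */
static uint64_t *vmcs12_field(struct vmcs12 *v, enum field f)
{
    switch (f) {
    case FIELD_GUEST_RIP:   return &v->guest_rip;
    case FIELD_GUEST_RSP:   return &v->guest_rsp;
    case FIELD_EXIT_REASON: return &v->exit_reason;
    default:                return NULL;
    }
}

/* Software format -> shadow, so L1 can vmread up-to-date values. */
void copy_vmcs12_to_shadow(struct vmcs12 *v, struct shadow_vmcs *s)
{
    for (int f = 0; f < FIELD_COUNT; f++) {
        uint64_t *p = vmcs12_field(v, (enum field)f);
        assert(p != NULL);
        shadow_write(s, (enum field)f, *p);
    }
}

/* Shadow -> software format, so L0 sees what L1 wrote with vmwrite. */
void copy_shadow_to_vmcs12(const struct shadow_vmcs *s, struct vmcs12 *v)
{
    for (int f = 0; f < FIELD_COUNT; f++) {
        uint64_t *p = vmcs12_field(v, (enum field)f);
        assert(p != NULL);
        *p = shadow_read(s, (enum field)f);
    }
}
```

In this sketch the software structure stays the format L1 sees, which is the property the cover letter relies on for things like live migration of L1.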
>
> We could have kept the processor-specific shadow-vmcs format in L1 address
> space to avoid using the software-defined VMCS12 format. However, this type of
> design/implementation would have created hardware dependencies and
> would have complicated other capabilities (e.g. Live Migration of L1).
>
> Changes since v1:
>  1) Added sync_shadow_vmcs flag used to indicate when the content of VMCS12
>     must be copied to the shadow vmcs. The flag value is checked during
>     vmx_vcpu_run.
>  2) Code quality improvements
>
> Changes since v2:
>  1) Allocate shadow vmcs only once per VCPU on handle_vmxon and re-use the
>     same instance for multiple VMCS12s
>  2) More code quality improvements
>
> Changes since v3:
>  1) Fixed VMXON emulation (new patch).
>     Previous nVMX code didn't verify if L1 is already in root mode (VMXON
>     was previously called). Now we call nested_vmx_failValid if VMX is
>     already ON. This is required to avoid host leaks (due to shadow vmcs
>     allocation) if L1 repeatedly executes VMXON.
>  2) Improved comment: clarified we do not shadow fields that are modified
>     when L1 executes vmx instructions like the VM_INSTRUCTION_ERROR field.
>
> Changes since v4:
>  1) Fixed free_nested: we now free the shadow vmcs also
>     when there is no current vmcs.
>
> Acknowledgments:
>
> Many thanks to
>      "Natapov, Gleb" <gleb@xxxxxxxxxx>
>      "Xu, Dongxiao" <dongxiao.xu@xxxxxxxxx>
>      "Nakajima, Jun" <jun.nakajima@xxxxxxxxx>
>      "Har'El, Nadav" <nadav@xxxxxxxxxxxx>
>
> for the insightful discussions, comments and reviews.
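
The lifecycle rules listed in the changelog entries above (deferred sync flag, one shadow vmcs per VCPU, rejecting a repeated VMXON, unconditional free) can be sketched as a small user-space model. This is not the patch code: `struct nested_state`, `emulate_vmxon`, `mark_vmcs12_dirty`, `vcpu_run`, the `enum vmx_result` values and the 4096-byte allocation are all hypothetical names and sizes chosen for illustration, and `sync_count` exists only so the behaviour can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical result codes for the emulated instruction. */
enum vmx_result { VMX_SUCCEED, VMX_FAIL_VALID, VMX_FAIL_NOMEM };

/* Hypothetical per-VCPU nested state. */
struct nested_state {
    bool vmxon;             /* L1 already executed VMXON */
    void *shadow_vmcs;      /* allocated once per VCPU, reused across VMCS12s */
    bool sync_shadow_vmcs;  /* VMCS12 content must be copied to the shadow */
    unsigned sync_count;    /* illustration only: how many copies were done */
};

/* Emulated VMXON: fail instead of allocating again when VMX is already on,
 * so a guest looping on VMXON cannot leak host memory. */
enum vmx_result emulate_vmxon(struct nested_state *n)
{
    assert(n != NULL);
    if (n->vmxon)
        return VMX_FAIL_VALID;
    n->shadow_vmcs = calloc(1, 4096); /* size is an assumption */
    if (n->shadow_vmcs == NULL)
        return VMX_FAIL_NOMEM;
    n->vmxon = true;
    return VMX_SUCCEED;
}

/* Called whenever L0 changes VMCS12 behind L1's back. */
void mark_vmcs12_dirty(struct nested_state *n)
{
    n->sync_shadow_vmcs = true;
}

/* Run path: perform the pending copy at most once, then clear the flag. */
void vcpu_run(struct nested_state *n)
{
    if (n->sync_shadow_vmcs) {
        n->sync_count++; /* a real implementation would copy the fields here */
        n->sync_shadow_vmcs = false;
    }
}

/* Teardown: release the shadow vmcs regardless of whether a current
 * VMCS12 is loaded. */
void free_nested(struct nested_state *n)
{
    if (!n->vmxon)
        return;
    free(n->shadow_vmcs);
    n->shadow_vmcs = NULL;
    n->sync_shadow_vmcs = false;
    n->vmxon = false;
}
```

Checking the flag on the run path, rather than copying on every VMCS12 change, batches several modifications into a single copy before L1 resumes.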
>
>
> These patches were easily created and maintained using
>      Patchouli -- patch creator
>      http://patchouli.sourceforge.net/
>
>
> [1] "The Turtles Project: Design and Implementation of Nested Virtualization",
>     http://www.usenix.org/events/osdi10/tech/full_papers/Ben-Yehuda.pdf
>

Hard to keep up :)

Reviewed-by: Orit Wasserman <owasserm@xxxxxxxxxx>