Add a module option "nested_ept" determining whether to enable Nested EPT.

Nested EPT means emulating EPT for an L1 guest so that L1 can use EPT when
running a nested guest L2. When L1 uses EPT, the L2 guest can set its own
cr3 and take its own page faults without either L0 or L1 getting involved.
This often significantly improves L2's performance over the previous two
alternatives (shadow page tables over EPT, and shadow page tables over
shadow page tables).

nested_ept is currently enabled by default (when nested VMX is enabled),
unless L0 doesn't have EPT or has it disabled with ept=0.

Users would not normally want to explicitly disable this option. One reason
to disable it is to force L1 to make do without the EPT capability, when
anticipating a future need to migrate this L1 to another host which doesn't
have EPT. Note that currently there is no API to turn off nested EPT for
just a single L1 guest. However, an individual L1 guest may choose not to
use EPT: nested_cpu_has_ept() checks whether L1 actually enabled EPT when
running L2.

In the future, we can support emulation of EPT for L1 *always*, even when L0
itself doesn't have EPT. This so-called "EPT on shadow page tables" mode
has some theoretical advantages over the baseline "shadow page tables on
shadow page tables" mode typically used when EPT is not available to L0,
namely that L2's cr3 changes and page faults can be handled in L0 and do not
need to be propagated to L1. However, currently we do not support this mode,
and it is becoming less interesting as newer processors all support EPT.

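For illustration only, the two decisions described above can be modelled in
a small user-space sketch. It is not kernel code: struct vmx_params,
struct vmcs12_model, effective_nested_ept(), l1_uses_ept() and
example_check() are hypothetical names. The two bit values are assumed to
match the Linux VMX headers, and the sketch assumes nested_cpu_has2() also
requires the "activate secondary controls" bit; neither is shown in this
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values from the Linux VMX headers; not defined by this patch. */
#define CPU_BASED_ACTIVATE_SECONDARY_CONTROLS 0x80000000u
#define SECONDARY_EXEC_ENABLE_EPT             0x00000002u

/* Hypothetical stand-in for the three module parameters involved. */
struct vmx_params {
	bool nested;      /* nested VMX enabled (nested=1) */
	bool enable_ept;  /* L0 has EPT and did not disable it (ept=1) */
	bool nested_ept;  /* value requested by the user, default 1 */
};

/* Hypothetical model of the two vmcs12 control fields that matter here. */
struct vmcs12_model {
	uint32_t cpu_based_vm_exec_control;
	uint32_t secondary_vm_exec_control;
};

/*
 * Mirrors the hardware_setup() hunk: nested EPT is forced off unless
 * both nested VMX and EPT in L0 are available.
 */
bool effective_nested_ept(struct vmx_params p)
{
	if (!p.nested || !p.enable_ept)
		return false;
	return p.nested_ept;
}

/*
 * Models the nested_cpu_has_ept() check: L1 is treated as using EPT for
 * L2 only if it activated the secondary controls and set the EPT bit.
 */
bool l1_uses_ept(const struct vmcs12_model *v)
{
	return (v->cpu_based_vm_exec_control &
		CPU_BASED_ACTIVATE_SECONDARY_CONTROLS) &&
	       (v->secondary_vm_exec_control & SECONDARY_EXEC_ENABLE_EPT);
}

/* Usage sketch: ept=0 in L0 overrides the nested_ept default of 1. */
void example_check(void)
{
	struct vmx_params p = { .nested = true, .enable_ept = false,
				.nested_ept = true };
	struct vmcs12_model v = { CPU_BASED_ACTIVATE_SECONDARY_CONTROLS, 0 };

	assert(!effective_nested_ept(p));
	assert(!l1_uses_ept(&v));  /* L1 left the EPT bit clear */
}
```

The point of the sketch is that the module option only controls whether L0
offers EPT to L1; whether a given L1 actually uses it is decided per vmcs12.
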
Signed-off-by: Nadav Har'El <nyh@xxxxxxxxxx>
---
 arch/x86/kvm/vmx.c |   12 ++++++++++++
 1 file changed, 12 insertions(+)

--- .before/arch/x86/kvm/vmx.c	2011-11-10 11:33:58.000000000 +0200
+++ .after/arch/x86/kvm/vmx.c	2011-11-10 11:33:58.000000000 +0200
@@ -83,6 +83,10 @@ module_param(fasteoi, bool, S_IRUGO);
 static int __read_mostly nested = 0;
 module_param(nested, bool, S_IRUGO);
 
+/* Whether L0 emulates EPT for its L1 guests. It doesn't mean L1 must use it */
+static int __read_mostly nested_ept = 1;
+module_param(nested_ept, bool, S_IRUGO);
+
 #define KVM_GUEST_CR0_MASK_UNRESTRICTED_GUEST				\
 	(X86_CR0_WP | X86_CR0_NE | X86_CR0_NW | X86_CR0_CD)
 #define KVM_GUEST_CR0_MASK						\
@@ -875,6 +879,11 @@ static inline bool nested_cpu_has_virtua
 	return vmcs12->pin_based_vm_exec_control & PIN_BASED_VIRTUAL_NMIS;
 }
 
+static inline int nested_cpu_has_ept(struct vmcs12 *vmcs12)
+{
+	return nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_EPT);
+}
+
 static inline bool is_exception(u32 intr_info)
 {
 	return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VALID_MASK))
@@ -2642,6 +2651,9 @@ static __init int hardware_setup(void)
 	if (!cpu_has_vmx_ple())
 		ple_gap = 0;
 
+	if (!nested || !enable_ept)
+		nested_ept = 0;
+
 	if (nested)
 		nested_vmx_setup_ctls_msrs();
 