On Mon, May 14, 2012 at 06:25:40AM +0000, Mao, Junjie wrote: > This patch handles PCID/INVPCID for guests. > > Process-context identifiers (PCIDs) are a facility by which a logical processor > may cache information for multiple linear-address spaces so that the processor > may retain cached information when software switches to a different linear > address space. Refer to section 4.10.1 in IA32 Intel Software Developer's Manual > Volume 3A for details. > > For guests with EPT, the PCID feature is enabled and INVPCID behaves as running > natively. > For guests without EPT, the PCID feature is disabled and INVPCID triggers #UD. > > Changes from v1: > Move cr0/cr4 writing checks to x86.c > Update comments for the reason why PCID is disabled for non-EPT guests > Do not support PCID/INVPCID for nested guests at present > Clean up useless symbols > > Signed-off-by: Junjie Mao <junjie.mao@xxxxxxxxx> > +++ b/arch/x86/kvm/vmx.c > @@ -1711,6 +1711,18 @@ static bool vmx_rdtscp_supported(void) > return cpu_has_vmx_rdtscp(); > } > > +static bool vmx_pcid_supported(void) > +{ > + /* > + * Enable INVPCID for non-ept guests may cause performance regression, > + * and without INVPCID, PCID has little benefits. So disable them all > + * for non-ept guests. > + * > + * PCID is not supported for nested guests yet. > + */ > + return enable_ept && (boot_cpu_data.x86_capability[4] & bit(X86_FEATURE_PCID)) && !cpu_has_hypervisor; > +} The comment Avi made was regarding running a nested guest, not running _as_ a nested guest (which is what cpu_has_hypervisor is about). You can disable INVPCID exec control (which #UDs), if its in Level-2 guest mode (see if_guest_mode()), and restore the Level-1 value when leaving nested mode. > + > /* > * Swap MSR entry in host/guest MSR entry array. > */ > @@ -2425,6 +2437,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf) > SECONDARY_EXEC_UNRESTRICTED_GUEST | > SECONDARY_EXEC_PAUSE_LOOP_EXITING | > SECONDARY_EXEC_RDTSCP; > + if (vmx_pcid_supported()) > + opt2 |= SECONDARY_EXEC_ENABLE_INVPCID; You should allow ENABLE_INVPCID control to be set unconditionally here, and adjusted in vmx_secondary_exec_control(). (note that "enable_ept" might be cleared after setup_vmcs_config). > if (adjust_vmx_controls(min2, opt2, > MSR_IA32_VMX_PROCBASED_CTLS2, > &_cpu_based_2nd_exec_control) < 0) > @@ -6420,6 +6434,20 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu) > } > } > } > + > + if (vmx_pcid_supported()) { > + best = kvm_find_cpuid_entry(vcpu, 0x1, 0); > + if (!best || !(best->ecx & bit(X86_FEATURE_PCID))) { > + /* Hiding INVPCID when PCID is not exposed. */ > + exec_control = vmcs_read32(SECONDARY_VM_EXEC_CONTROL); > + exec_control &= ~SECONDARY_EXEC_ENABLE_INVPCID; > + vmcs_write32(SECONDARY_VM_EXEC_CONTROL, > + exec_control); > + best = kvm_find_cpuid_entry(vcpu, 0x7, 0); > + if (best) > + best->ecx &= ~bit(X86_FEATURE_INVPCID); > + } > + } - If X86_FEATURE_CPUID bit is set by guest, but X86_FEATURE_INVPCID is cleared, this allows invpcid to execute (which is wrong, it should #UD). - Must enable vm_exec control bit if cpuid reports the feature enabled, not only disable it. > } > > static void vmx_set_supported_cpuid(u32 func, struct kvm_cpuid_entry2 *entry) > @@ -7154,6 +7182,7 @@ static struct kvm_x86_ops vmx_x86_ops = { > .cpuid_update = vmx_cpuid_update, > > .rdtscp_supported = vmx_rdtscp_supported, > + .pcid_supported = vmx_pcid_supported, > > .set_supported_cpuid = vmx_set_supported_cpuid, > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index c9d99e5..f930597 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -527,6 +527,10 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0) > return 1; > } > > + if ((old_cr0 & X86_CR0_PG) && !(cr0 & X86_CR0_PG) && > + kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE)) > + return 1; > + > kvm_x86_ops->set_cr0(vcpu, cr0); For completeness, kvm_set_cr3 should deal with the CR3 bits. It is used by nested VMX for example. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html