Re: [PATCH v2] KVM: x86: Implement PCID/INVPCID for guests with EPT

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, May 14, 2012 at 06:25:40AM +0000, Mao, Junjie wrote:
> This patch handles PCID/INVPCID for guests.
> 
> Process-context identifiers (PCIDs) are a facility by which a logical processor
> may cache information for multiple linear-address spaces so that the processor
> may retain cached information when software switches to a different linear
> address space. Refer to section 4.10.1 in IA32 Intel Software Developer's Manual
> Volume 3A for details.
> 
> For guests with EPT, the PCID feature is enabled and INVPCID behaves as running
> natively.
> For guests without EPT, the PCID feature is disabled and INVPCID triggers #UD.
> 
> Changes from v1:
> 	Move cr0/cr4 writing checks to x86.c
> 	Update comments for the reason why PCID is disabled for non-EPT guests
> 	Do not support PCID/INVPCID for nested guests at present
> 	Clean up useless symbols
> 
> Signed-off-by: Junjie Mao <junjie.mao@xxxxxxxxx>
> +++ b/arch/x86/kvm/vmx.c
> @@ -1711,6 +1711,18 @@ static bool vmx_rdtscp_supported(void)
>  	return cpu_has_vmx_rdtscp();
>  }
>  
> +static bool vmx_pcid_supported(void)
> +{
> +	/*
> +	 * Enable INVPCID for non-ept guests may cause performance regression,
> +	 * and without INVPCID, PCID has little benefits. So disable them all
> +	 * for non-ept guests.
> +	 *
> +	 * PCID is not supported for nested guests yet.
> +	 */
> +	return enable_ept && (boot_cpu_data.x86_capability[4] & bit(X86_FEATURE_PCID)) && !cpu_has_hypervisor;
> +}

The comment Avi made was regarding running a nested guest, not running
_as_ a nested guest (which is what cpu_has_hypervisor is about).

You can disable INVPCID exec control (which #UDs), if its in Level-2
guest mode (see if_guest_mode()), and restore the Level-1 value when
leaving nested mode.

> +
>  /*
>   * Swap MSR entry in host/guest MSR entry array.
>   */
> @@ -2425,6 +2437,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
>  			SECONDARY_EXEC_UNRESTRICTED_GUEST |
>  			SECONDARY_EXEC_PAUSE_LOOP_EXITING |
>  			SECONDARY_EXEC_RDTSCP;
> +		if (vmx_pcid_supported())
> +			opt2 |= SECONDARY_EXEC_ENABLE_INVPCID;


You should allow ENABLE_INVPCID control to be set unconditionally here, 
and adjusted in vmx_secondary_exec_control().

(note that "enable_ept" might be cleared after setup_vmcs_config).

>  		if (adjust_vmx_controls(min2, opt2,
>  					MSR_IA32_VMX_PROCBASED_CTLS2,
>  					&_cpu_based_2nd_exec_control) < 0)
> @@ -6420,6 +6434,20 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
>  			}
>  		}
>  	}
> +
> +	if (vmx_pcid_supported()) {
> +		best = kvm_find_cpuid_entry(vcpu, 0x1, 0);
> +		if (!best || !(best->ecx & bit(X86_FEATURE_PCID))) {
> +			/* Hiding INVPCID when PCID is not exposed. */
> +			exec_control = vmcs_read32(SECONDARY_VM_EXEC_CONTROL);
> +			exec_control &= ~SECONDARY_EXEC_ENABLE_INVPCID;
> +			vmcs_write32(SECONDARY_VM_EXEC_CONTROL,
> +				     exec_control);
> +			best = kvm_find_cpuid_entry(vcpu, 0x7, 0);
> +			if (best)
> +				best->ecx &= ~bit(X86_FEATURE_INVPCID);
> +		}
> +	}

- If X86_FEATURE_CPUID bit is set by guest, but X86_FEATURE_INVPCID is
  cleared, this allows invpcid to execute (which is wrong, it should
  #UD).
- Must enable vm_exec control bit if cpuid reports the feature enabled,
  not only disable it.

>  }
>  
>  static void vmx_set_supported_cpuid(u32 func, struct kvm_cpuid_entry2 *entry)
> @@ -7154,6 +7182,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
>  	.cpuid_update = vmx_cpuid_update,
>  
>  	.rdtscp_supported = vmx_rdtscp_supported,
> +	.pcid_supported = vmx_pcid_supported,
>  
>  	.set_supported_cpuid = vmx_set_supported_cpuid,
>  
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index c9d99e5..f930597 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -527,6 +527,10 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
>  			return 1;
>  	}
>  
> +	if ((old_cr0 & X86_CR0_PG) && !(cr0 & X86_CR0_PG) &&
> +	    kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE))
> +		return 1;
> +
>  	kvm_x86_ops->set_cr0(vcpu, cr0);

For completeness, kvm_set_cr3 should deal with the CR3 bits. It is used
by nested VMX for example.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux