RE: [PATCH] KVM: x86: Implement PCID/INVPCID for guests with EPT

> On 05/10/2012 03:32 AM, Mao, Junjie wrote:
> > This patch handles PCID/INVPCID for guests.
> >
> > Process-context identifiers (PCIDs) are a facility by which a logical
> > processor may cache information for multiple linear-address spaces so that
> > the processor may retain cached information when software switches to a
> > different linear-address space. Refer to section 4.10.1 in IA32 Intel
> > Software Developer's Manual Volume 3A for details.
> >
> > For guests with EPT, the PCID feature is enabled and INVPCID behaves as
> > running natively.
> > For guests without EPT, the PCID feature is disabled and INVPCID triggers
> > #UD.
> >
> >
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index 74c9edf..bb9a707 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -52,7 +52,7 @@
> >  #define CR4_RESERVED_BITS                                               \
> >  	(~(unsigned long)(X86_CR4_VME | X86_CR4_PVI | X86_CR4_TSD | X86_CR4_DE\
> >  			  | X86_CR4_PSE | X86_CR4_PAE | X86_CR4_MCE     \
> > -			  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR  \
> > +			  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
> >  			  | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_RDWRGSFS \
> >  			  | X86_CR4_OSXMMEXCPT | X86_CR4_VMXE))
> 
> We should hide cr4.pcide from nested vmx, until we prepare that code to
> handle it.

I'll hide it from nested guests.

> 
> > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> > index d2bd719..ba00789 100644
> > --- a/arch/x86/kvm/vmx.c
> > +++ b/arch/x86/kvm/vmx.c
> > @@ -413,6 +413,7 @@ struct vcpu_vmx {
> >  	u32 exit_reason;
> >
> >  	bool rdtscp_enabled;
> > +	bool invpcid_enabled;
> >
> >  	/* Support for a guest hypervisor (nested VMX) */
> >  	struct nested_vmx nested;
> > @@ -839,6 +840,12 @@ static inline bool cpu_has_vmx_rdtscp(void)
> >  		SECONDARY_EXEC_RDTSCP;
> >  }
> >
> > +static bool vmx_pcid_supported(void)
> > +{
> > +	/* Enable PCID for non-ept guests may cause performance regression */
> 
> Why is that?

For guests using shadow page tables, every INVPCID must be intercepted so that changes to the guest page tables can be reflected in the shadow page tables, and those extra exits hurt performance. Without INVPCID, the PCID feature brings little benefit. As a result, PCID/INVPCID is not exposed to non-EPT guests. Sorry for being unclear in the comment.
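
The exposure rule described above can be summarized in a small stand-alone sketch. It is only a model of the policy, not KVM code: the struct and function names are made up for illustration, and the host capabilities are passed in as plain booleans instead of being read from CPUID.

```c
#include <stdbool.h>

/* Hypothetical model of the policy; none of these names come from the patch. */
struct pcid_policy {
	bool expose_pcid;	/* advertise PCID to the guest */
	bool expose_invpcid;	/* advertise INVPCID to the guest */
};

static struct pcid_policy pcid_policy_for(bool ept_enabled,
					  bool host_has_pcid,
					  bool host_has_invpcid)
{
	struct pcid_policy p = { false, false };

	/*
	 * Shadow paging: every INVPCID would have to be intercepted to keep
	 * the shadow page tables in sync, so neither feature is exposed.
	 */
	if (!ept_enabled)
		return p;

	p.expose_pcid = host_has_pcid;
	/* INVPCID is only offered together with PCID. */
	p.expose_invpcid = host_has_pcid && host_has_invpcid;
	return p;
}
```

With `ept_enabled` false the result is always "expose nothing", regardless of what the host CPU supports.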

> 
> > +	return enable_ept && (boot_cpu_data.x86_capability[4] &
> > +		bit(X86_FEATURE_PCID));
> > +}
> > +
> >  /*
> >   * Swap MSR entry in host/guest MSR entry array.
> >   */
> > @@ -4337,8 +4352,14 @@ static int handle_set_cr0(struct kvm_vcpu *vcpu, unsigned long val)
> >  			return 1;
> >  		vmcs_writel(CR0_READ_SHADOW, val);
> >  		return 0;
> > -	} else
> > +	} else {
> > +		unsigned long old_cr0 = kvm_read_cr0(vcpu);
> > +		if ((old_cr0 & X86_CR0_PG) && !(val & X86_CR0_PG) &&
> > +		    (kvm_read_cr4(vcpu) & X86_CR4_PCIDE))
> 
> Use kvm_read_cr4_bits(), it's slightly faster.  Also move this to x86.c.
> 
> > +			return 1;
> > +
> >  		return kvm_set_cr0(vcpu, val);
> > +	}
> >  }
> >
> >  static int handle_set_cr4(struct kvm_vcpu *vcpu, unsigned long val)
> > @@ -4349,8 +4370,26 @@ static int handle_set_cr4(struct kvm_vcpu *vcpu, unsigned long val)
> >  			return 1;
> >  		vmcs_writel(CR4_READ_SHADOW, val);
> >  		return 0;
> > -	} else
> > -		return kvm_set_cr4(vcpu, val);
> > +	} else {
> > +		unsigned long old_cr4 = kvm_read_cr4(vcpu);
> > +		int ret = 1;
> > +
> > +		if ((val & X86_CR4_PCIDE) && !(old_cr4 & X86_CR4_PCIDE)) {
> > +			if (!guest_cpuid_has_pcid(vcpu))
> > +				return ret;
> > +
> > +			/* PCID can not be enabled when cr3[11:0]!=000H or EFER.LMA=0 */
> > +			if ((kvm_read_cr3(vcpu) & X86_CR3_PCID_MASK) || !is_long_mode(vcpu))
> > +				return ret;
> > +		}
> > +
> > +		ret = kvm_set_cr4(vcpu, val);
> > +
> > +		if (!ret && (!(val & X86_CR4_PCIDE) && (old_cr4 & X86_CR4_PCIDE)))
> > +			kvm_mmu_reset_context(vcpu);
> > +
> > +		return ret;
> > +	}
> 
> Move to x86.c please.
> 
> >  }
> >
> >  /* called to set cr0 as approriate for clts instruction exit. */
> > @@ -6420,6 +6459,23 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
> >  			}
> >  		}
> >  	}
> > +
> > +	vmx->invpcid_enabled = false;
> > +	if (vmx_pcid_supported()) {
> > +		exec_control = vmcs_read32(SECONDARY_VM_EXEC_CONTROL);
> > +		if (exec_control & SECONDARY_EXEC_ENABLE_INVPCID) {
> > +			best = kvm_find_cpuid_entry(vcpu, 0x1, 0);
> > +			if (best && (best->ecx & bit(X86_FEATURE_PCID)))
> > +				vmx->invpcid_enabled = true;
> > +			else {
> > +				exec_control &= ~SECONDARY_EXEC_ENABLE_INVPCID;
> > +				vmcs_write32(SECONDARY_VM_EXEC_CONTROL,
> > +						exec_control);
> > +				best = kvm_find_cpuid_entry(vcpu, 0x7, 0);
> > +				best->ecx &= ~bit(X86_FEATURE_INVPCID);
> > +			}
> > +		}
> > +	}
> >  }
> >
> >
> 
> If we enter a nested guest (which is running without PCID), we need either to
> handle INVPCID exits (and inject a #UD) or disable INVPCID in exec controls.
> The first is faster since it doesn't involve VMWRITEs.
> If we do that, we don't need this code (since it will work for non-nested guests
> as well).

I'm not that familiar with how nested guests work, so excuse me for a possibly silly question: if we choose to trigger INVPCID exits and inject #UD for both non-nested and nested guests without INVPCID, then 'INVLPG exiting' must also be set, since it is required for INVPCID to cause an exit. Could that cause performance problems for non-nested EPT guests?
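
The dependency behind this question can be written down as a small stand-alone model. This is a sketch based on my reading of the SDM, not KVM code: the enum and function names are invented for illustration, and other fault conditions (privilege level, invalid operands) are deliberately ignored.

```c
#include <stdbool.h>

/* Hypothetical names; only the two VM-execution controls are modeled. */
enum invpcid_outcome {
	INVPCID_UD,	/* "enable INVPCID" is 0: the guest gets #UD */
	INVPCID_VMEXIT,	/* both controls are 1: VM exit to the hypervisor */
	INVPCID_NATIVE,	/* enabled and not intercepted: runs in the guest */
};

static enum invpcid_outcome invpcid_in_guest(bool enable_invpcid,
					     bool invlpg_exiting)
{
	/* With "enable INVPCID" clear, the instruction is undefined. */
	if (!enable_invpcid)
		return INVPCID_UD;

	/* An exit happens only if "INVLPG exiting" is set as well. */
	return invlpg_exiting ? INVPCID_VMEXIT : INVPCID_NATIVE;
}
```

So intercepting INVPCID in order to inject #UD from the hypervisor needs both controls set, and setting 'INVLPG exiting' makes every guest INVLPG exit too. That side effect is the cost being asked about.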

> 
> --
> error compiling committee.c: too many arguments to function
