Re: [RFC PATCH v3 26/58] KVM: x86/pmu: Manage MSR interception for IA32_PERF_GLOBAL_CTRL

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 11/20/2024 2:16 AM, Sean Christopherson wrote:
> On Thu, Aug 01, 2024, Mingwei Zhang wrote:
>> ---
>>  arch/x86/include/asm/vmx.h |   1 +
>>  arch/x86/kvm/vmx/vmx.c     | 117 +++++++++++++++++++++++++++++++------
>>  arch/x86/kvm/vmx/vmx.h     |   3 +-
>>  3 files changed, 103 insertions(+), 18 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
>> index d77a31039f24..5ed89a099533 100644
>> --- a/arch/x86/include/asm/vmx.h
>> +++ b/arch/x86/include/asm/vmx.h
>> @@ -106,6 +106,7 @@
>>  #define VM_EXIT_CLEAR_BNDCFGS                   0x00800000
>>  #define VM_EXIT_PT_CONCEAL_PIP			0x01000000
>>  #define VM_EXIT_CLEAR_IA32_RTIT_CTL		0x02000000
>> +#define VM_EXIT_SAVE_IA32_PERF_GLOBAL_CTRL      0x40000000
> Please add a helper in capabilities.h:
>
> static inline bool cpu_has_save_perf_global_ctrl(void)
> {
> 	return vmcs_config.vmexit_ctrl & VM_EXIT_SAVE_IA32_PERF_GLOBAL_CTRL;
> }

Sure.


>
>>  #define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR	0x00036dff
>>  
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index 339742350b7a..34a420fa98c5 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -4394,6 +4394,97 @@ static u32 vmx_pin_based_exec_ctrl(struct vcpu_vmx *vmx)
>>  	return pin_based_exec_ctrl;
>>  }
>>  
>> +static void vmx_set_perf_global_ctrl(struct vcpu_vmx *vmx)
> This is a misleading and inaccurate name.  It does far more than "set" PERF_GLOBAL_CTRL,
> it arguably doesn't ever "set" the MSR, and it gets the VMWRITE for the guest field
> wrong too.
>
>> +{
>> +	u32 vmentry_ctrl = vm_entry_controls_get(vmx);
>> +	u32 vmexit_ctrl = vm_exit_controls_get(vmx);
>> +	struct vmx_msrs *m;
>> +	int i;
>> +
>> +	if (cpu_has_perf_global_ctrl_bug() ||
> Note, cpu_has_perf_global_ctrl_bug() broken and needs to be purged:
> https://lore.kernel.org/all/20241119011433.1797921-1-seanjc@xxxxxxxxxx
>
> Note #2, as mentioned earlier, the mediated PMU should take a hard depenency on
> the load/save controls.
>
> On to this code, it fails to enable the load/save controls, e.g. if userspace
> does KVM_SET_CPUID2 without a PMU, then KVM_SET_CPUID2 with a PMU.  In that case,
> KVM will fail to set the control bits, and will fallback to the slow MSR load/save
> lists.
>
> With all of the above and other ideas combined, something like so:
>
> 	bool set = enable_mediated_pmu && kvm_pmu_has_perf_global_ctrl();
>
> 	vm_entry_controls_changebit(vmx, VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL, set);
> 	vm_exit_controls_changebit(vmx,
> 				   VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |
> 				   VM_EXIT_SAVE_IA32_PERF_GLOBAL_CTRL, set);
>
>
> And I vote to put this in intel_pmu_refresh(); that avoids needing to figure out
> a name for the helper, while giving more flexibililty on the local variable name.
>
> Oh!  Definitely put it in intel_pmu_refresh(), because the RDPMC and MSR
> interception logic needs to be there.  E.g. toggling CPU_BASED_RDPMC_EXITING
> based solely on CPUID won't do the right thing if KVM ends up making the behavior
> depend on PERF_CAPABILITIES.
>
> Ditto for MSRs.  Though until my patch/series that drops kvm_pmu_refresh() from
> kvm_pmu_init() lands[*], trying to update MSR intercepts during refresh() will hit
> a NULL pointer deref as it's currently called before vmcs01 is allocated :-/
>
> I expect to land that series before mediated PMU, but I don't think it makes sense
> to take an explicit dependency for this series.  To fudge around the issue, maybe
> do this for the next version?
>
> static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
> {
> 	__intel_pmu_refresh(vcpu);
>
> 	/*
> 	 * FIXME: Drop the MSR bitmap check if/when kvm_pmu_init() no longer
> 	 *        calls kvm_pmu_refresh(), i.e. when KVM refreshes the PMU only
> 	 *        after vmcs01 is allocated.
> 	 */
> 	if (to_vmx(vcpu)->vmcs01.msr_bitmap)
> 		intel_update_msr_intercepts(vcpu);
>
> 	vm_entry_controls_changebit(vmx, VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL,
> 				    enable_mediated_pmu && kvm_pmu_has_perf_global_ctrl());
>
> 	vm_exit_controls_changebit(vmx,
> 				   VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |
> 				   VM_EXIT_SAVE_IA32_PERF_GLOBAL_CTRL,
> 				   enable_mediated_pmu && kvm_pmu_has_perf_global_ctrl());
> }
>
> or with a local variable for "enable_mediated_pmu && kvm_pmu_has_perf_global_ctrl()".
> I can't come up with a decent name. :-)
>
> [*] https://lore.kernel.org/all/20240517173926.965351-10-seanjc@xxxxxxxxxx

Sure. This looks better.


>
>> +	    !is_passthrough_pmu_enabled(&vmx->vcpu)) {
>> +		vmentry_ctrl &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
>> +		vmexit_ctrl &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
>> +		vmexit_ctrl &= ~VM_EXIT_SAVE_IA32_PERF_GLOBAL_CTRL;
>> +	}
>> +
>> +	if (is_passthrough_pmu_enabled(&vmx->vcpu)) {
>> +		/*
>> +		 * Setup auto restore guest PERF_GLOBAL_CTRL MSR at vm entry.
>> +		 */
>> +		if (vmentry_ctrl & VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL) {
>> +			vmcs_write64(GUEST_IA32_PERF_GLOBAL_CTRL, 0);
> This incorrectly clobbers the guest's value.  A simple way to handle this is to
> always propagate writes to PERF_GLOBAL_CTRL to the VMCS, if the write is allowed
> and enable_mediated_pmu.  I.e. ensure GUEST_IA32_PERF_GLOBAL_CTRL is up-to-date
> regardless of whether or not it's configured to be loaded.  Then there's no need
> to write it here.

For mediated vPMU, I think we can move this into intel_pmu_refresh() as
well, and just after the vm_entry_controls_changebit() helpers. the value
should be supported gp counters mask which respects the behavior of
PERF_GLOBAL_CTRL reset.


>
>> +		} else {
>> +			m = &vmx->msr_autoload.guest;
>> +			i = vmx_find_loadstore_msr_slot(m, MSR_CORE_PERF_GLOBAL_CTRL);
>> +			if (i < 0) {
>> +				i = m->nr++;
>> +				vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, m->nr);
>> +			}
>> +			m->val[i].index = MSR_CORE_PERF_GLOBAL_CTRL;
>> +			m->val[i].value = 0;
>> +		}
>> +		/*
>> +		 * Setup auto clear host PERF_GLOBAL_CTRL msr at vm exit.
>> +		 */
>> +		if (vmexit_ctrl & VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL) {
>> +			vmcs_write64(HOST_IA32_PERF_GLOBAL_CTRL, 0);
> This should be unnecessary.  KVM should clear HOST_IA32_PERF_GLOBAL_CTRL in
> vmx_set_constant_host_state() if enable_mediated_pmu is true.  Arguably, it might
> make sense to clear it unconditionally, but with a comment explaining that it's
> only actually constant for the mediated PMU.

Sure. would move this clearing into vmx_set_constant_host_state(). Yeah, I
suppose it can be cleared unconditionally since
VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL bit never be set for legacy emulated vPMU.


>
> And if the mediated PMU requires the VMCS knobs, then all of the load/store list
> complexity goes away.
>
>>  static u32 vmx_vmentry_ctrl(void)
>>  {
>>  	u32 vmentry_ctrl = vmcs_config.vmentry_ctrl;
>> @@ -4401,17 +4492,10 @@ static u32 vmx_vmentry_ctrl(void)
>>  	if (vmx_pt_mode_is_system())
>>  		vmentry_ctrl &= ~(VM_ENTRY_PT_CONCEAL_PIP |
>>  				  VM_ENTRY_LOAD_IA32_RTIT_CTL);
>> -	/*
>> -	 * IA32e mode, and loading of EFER and PERF_GLOBAL_CTRL are toggled dynamically.
>> -	 */
>> -	vmentry_ctrl &= ~(VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |
>> -			  VM_ENTRY_LOAD_IA32_EFER |
>> -			  VM_ENTRY_IA32E_MODE);
>> -
>> -	if (cpu_has_perf_global_ctrl_bug())
>> -		vmentry_ctrl &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
>> -
>> -	return vmentry_ctrl;
>> +	 /*
>> +	  * IA32e mode, and loading of EFER is toggled dynamically.
>> +	  */
>> +	return vmentry_ctrl &= ~(VM_ENTRY_LOAD_IA32_EFER | VM_ENTRY_IA32E_MODE);
> With my above suggestion, these changes are unnecessary.  If enable_mediated_pmu
> is false, or the vCPU doesn't have a PMU, clearing the controls is correct.  And
> when the vCPU is gifted a PMU, KVM will explicitly enabled the controls.
Yes.

>
> To discourage incorrect usage of these helpers maybe rename them to
> vmx_get_initial_{vmentry,vmexit}_ctrl()?

Sure.


>
>>  }
>>  
>>  static u32 vmx_vmexit_ctrl(void)
>> @@ -4429,12 +4513,8 @@ static u32 vmx_vmexit_ctrl(void)
>>  		vmexit_ctrl &= ~(VM_EXIT_PT_CONCEAL_PIP |
>>  				 VM_EXIT_CLEAR_IA32_RTIT_CTL);
>>  
>> -	if (cpu_has_perf_global_ctrl_bug())
>> -		vmexit_ctrl &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
>> -
>> -	/* Loading of EFER and PERF_GLOBAL_CTRL are toggled dynamically */
>> -	return vmexit_ctrl &
>> -		~(VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL | VM_EXIT_LOAD_IA32_EFER);
> But this code needs to *add* clearing of VM_EXIT_SAVE_IA32_PERF_GLOBAL_CTRL.

Sure.






[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux