Re: [RFC PATCH v3 18/58] KVM: x86/pmu: Introduce enable_passthrough_pmu module parameter

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 11/19/2024 10:30 PM, Sean Christopherson wrote:
> As per my feedback in the initial RFC[*]:
>
>  2. The module param absolutely must not be exposed to userspace until all patches
>     are in place.  The easiest way to do that without creating dependency hell is
>     to simply not create the module param.
>
> [*] https://lore.kernel.org/all/ZhhQBHQ6V7Zcb8Ve@xxxxxxxxxx

Sure. It looks we missed this comment. Would address it.


>
> On Thu, Aug 01, 2024, Mingwei Zhang wrote:
>> Introduce enable_passthrough_pmu as a RO KVM kernel module parameter. This
>> variable is true only when the following conditions satisfies:
>>  - set to true when module loaded.
>>  - enable_pmu is true.
>>  - is running on Intel CPU.
>>  - supports PerfMon v4.
>>  - host PMU supports passthrough mode.
>>
>> The value is always read-only because passthrough PMU currently does not
>> support features like LBR and PEBS, while emualted PMU does. This will end
>> up with two different values for kvm_cap.supported_perf_cap, which is
>> initialized at module load time. Maintaining two different perf
>> capabilities will add complexity. Further, there is not enough motivation
>> to support running two types of PMU implementations at the same time,
>> although it is possible/feasible in reality.
>>
>> Finally, always propagate enable_passthrough_pmu and perf_capabilities into
>> kvm->arch for each KVM instance.
>>
>> Co-developed-by: Xiong Zhang <xiong.y.zhang@xxxxxxxxxxxxxxx>
>> Signed-off-by: Xiong Zhang <xiong.y.zhang@xxxxxxxxxxxxxxx>
>> Signed-off-by: Mingwei Zhang <mizhang@xxxxxxxxxx>
>> Signed-off-by: Dapeng Mi <dapeng1.mi@xxxxxxxxxxxxxxx>
>> Tested-by: Yongwei Ma <yongwei.ma@xxxxxxxxx>
>> ---
>>  arch/x86/include/asm/kvm_host.h |  1 +
>>  arch/x86/kvm/pmu.h              | 14 ++++++++++++++
>>  arch/x86/kvm/vmx/vmx.c          |  7 +++++--
>>  arch/x86/kvm/x86.c              |  8 ++++++++
>>  arch/x86/kvm/x86.h              |  1 +
>>  5 files changed, 29 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index f8ca74e7678f..a15c783f20b9 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -1406,6 +1406,7 @@ struct kvm_arch {
>>  
>>  	bool bus_lock_detection_enabled;
>>  	bool enable_pmu;
>> +	bool enable_passthrough_pmu;
> Again, as I suggested/requested in the initial RFC[*], drop the per-VM flag as well
> as kvm_pmu.passthrough.  There is zero reason to cache the module param.  KVM
> should always query kvm->arch.enable_pmu prior to checking if the mediated PMU
> is enabled, so I doubt we even need a helper to check both.
>
> [*] https://lore.kernel.org/all/ZhhOEDAl6k-NzOkM@xxxxxxxxxx

Sure.


>
>>  
>>  	u32 notify_window;
>>  	u32 notify_vmexit_flags;
>> diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
>> index 4d52b0b539ba..cf93be5e7359 100644
>> --- a/arch/x86/kvm/pmu.h
>> +++ b/arch/x86/kvm/pmu.h
>> @@ -208,6 +208,20 @@ static inline void kvm_init_pmu_capability(const struct kvm_pmu_ops *pmu_ops)
>>  			enable_pmu = false;
>>  	}
>>  
>> +	/* Pass-through vPMU is only supported in Intel CPUs. */
>> +	if (!is_intel)
>> +		enable_passthrough_pmu = false;
>> +
>> +	/*
>> +	 * Pass-through vPMU requires at least PerfMon version 4 because the
>> +	 * implementation requires the usage of MSR_CORE_PERF_GLOBAL_STATUS_SET
>> +	 * for counter emulation as well as PMU context switch.  In addition, it
>> +	 * requires host PMU support on passthrough mode. Disable pass-through
>> +	 * vPMU if any condition fails.
>> +	 */
>> +	if (!enable_pmu || kvm_pmu_cap.version < 4 || !kvm_pmu_cap.passthrough)
> As is quite obvious by the end of the series, the v4 requirement is specific to
> Intel.
>
> 	if (!enable_pmu || !kvm_pmu_cap.passthrough ||
> 	    (is_intel && kvm_pmu_cap.version < 4) ||
> 	    (is_amd && kvm_pmu_cap.version < 2))
> 		enable_passthrough_pmu = false;
>
> Furthermore, there is zero reason to explicitly and manually check the vendor,
> kvm_init_pmu_capability() takes kvm_pmu_ops.  Adding a callback is somewhat
> undesirable as it would lead to duplicate code, but we can still provide separation
> of concerns by adding const variables to kvm_pmu_ops, a la MAX_NR_GP_COUNTERS.
>
> E.g.
>
> 	if (enable_pmu) {
> 		perf_get_x86_pmu_capability(&kvm_pmu_cap);
>
> 		/*
> 		 * WARN if perf did NOT disable hardware PMU if the number of
> 		 * architecturally required GP counters aren't present, i.e. if
> 		 * there are a non-zero number of counters, but fewer than what
> 		 * is architecturally required.
> 		 */
> 		if (!kvm_pmu_cap.num_counters_gp ||
> 		    WARN_ON_ONCE(kvm_pmu_cap.num_counters_gp < min_nr_gp_ctrs))
> 			enable_pmu = false;
> 		else if (pmu_ops->MIN_PMU_VERSION > kvm_pmu_cap.version)
> 			enable_pmu = false;
> 	}
>
> 	if (!enable_pmu || !kvm_pmu_cap.passthrough ||
> 	    pmu_ops->MIN_MEDIATED_PMU_VERSION > kvm_pmu_cap.version)
> 		enable_mediated_pmu = false;

Sure.  would do.


>> +		enable_passthrough_pmu = false;
>> +
>>  	if (!enable_pmu) {
>>  		memset(&kvm_pmu_cap, 0, sizeof(kvm_pmu_cap));
>>  		return;
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index ad465881b043..2ad122995f11 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -146,6 +146,8 @@ module_param_named(preemption_timer, enable_preemption_timer, bool, S_IRUGO);
>>  extern bool __read_mostly allow_smaller_maxphyaddr;
>>  module_param(allow_smaller_maxphyaddr, bool, S_IRUGO);
>>  
>> +module_param(enable_passthrough_pmu, bool, 0444);
> Hmm, we either need to put this param in kvm.ko, or move enable_pmu to vendor
> modules (or duplicate it there if we need to for backwards compatibility?).
>
> There are advantages to putting params in vendor modules, when it's safe to do so,
> e.g. it allows toggling the param when (re)loading a vendor module, so I think I'm
> supportive of having the param live in vendor code.  I just don't want to split
> the two PMU knobs.

Since enable_passthrough_pmu has already been in vendor modules,  we'd
better duplicate enable_pmu module parameter in vendor modules as the 1st step.


>
>>  #define KVM_VM_CR0_ALWAYS_OFF (X86_CR0_NW | X86_CR0_CD)
>>  #define KVM_VM_CR0_ALWAYS_ON_UNRESTRICTED_GUEST X86_CR0_NE
>>  #define KVM_VM_CR0_ALWAYS_ON				\
>> @@ -7924,7 +7926,8 @@ static __init u64 vmx_get_perf_capabilities(void)
>>  	if (boot_cpu_has(X86_FEATURE_PDCM))
>>  		rdmsrl(MSR_IA32_PERF_CAPABILITIES, host_perf_cap);
>>  
>> -	if (!cpu_feature_enabled(X86_FEATURE_ARCH_LBR)) {
>> +	if (!cpu_feature_enabled(X86_FEATURE_ARCH_LBR) &&
>> +	    !enable_passthrough_pmu) {
>>  		x86_perf_get_lbr(&vmx_lbr_caps);
>>  
>>  		/*
>> @@ -7938,7 +7941,7 @@ static __init u64 vmx_get_perf_capabilities(void)
>>  			perf_cap |= host_perf_cap & PMU_CAP_LBR_FMT;
>>  	}
>>  
>> -	if (vmx_pebs_supported()) {
>> +	if (vmx_pebs_supported() && !enable_passthrough_pmu) {
> Checking enable_mediated_pmu belongs in vmx_pebs_supported(), not in here,
> otherwise KVM will incorrectly advertise support to userspace:
>
> 	if (vmx_pebs_supported()) {
> 		kvm_cpu_cap_check_and_set(X86_FEATURE_DS);
> 		kvm_cpu_cap_check_and_set(X86_FEATURE_DTES64);
> 	}

Sure. Thanks for pointing this.


>
>>  		perf_cap |= host_perf_cap & PERF_CAP_PEBS_MASK;
>>  		/*
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index f1d589c07068..0c40f551130e 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -187,6 +187,10 @@ bool __read_mostly enable_pmu = true;
>>  EXPORT_SYMBOL_GPL(enable_pmu);
>>  module_param(enable_pmu, bool, 0444);
>>  
>> +/* Enable/disable mediated passthrough PMU virtualization */
>> +bool __read_mostly enable_passthrough_pmu;
>> +EXPORT_SYMBOL_GPL(enable_passthrough_pmu);
>> +
>>  bool __read_mostly eager_page_split = true;
>>  module_param(eager_page_split, bool, 0644);
>>  
>> @@ -6682,6 +6686,9 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>>  		mutex_lock(&kvm->lock);
>>  		if (!kvm->created_vcpus) {
>>  			kvm->arch.enable_pmu = !(cap->args[0] & KVM_PMU_CAP_DISABLE);
>> +			/* Disable passthrough PMU if enable_pmu is false. */
>> +			if (!kvm->arch.enable_pmu)
>> +				kvm->arch.enable_passthrough_pmu = false;
> And this code obviously goes away if the per-VM snapshot is removed.




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux