On Fri, Nov 11, 2022, Like Xu wrote:
> @@ -162,20 +179,42 @@ static int amd_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  static void amd_pmu_refresh(struct kvm_vcpu *vcpu)
>  {
>          struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
> +        struct kvm_cpuid_entry2 *entry;
> +        union cpuid_0x80000022_ebx ebx;
>
> -        if (guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE))
> -                pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS_CORE;
> +        pmu->version = 1;
> +        if (kvm_cpu_cap_has(X86_FEATURE_AMD_PMU_V2) &&

Why check kvm_cpu_cap support?  I.e. what will go wrong if userspace enumerates
PMU v2 to the guest without proper hardware/KVM support?

If this is _necessary_ to protect the host kernel, then we should probably have
a helper to query PMU features, e.g.

        static __always_inline bool guest_pmu_has(struct kvm_vcpu *vcpu,
                                                  unsigned int x86_feature)
        {
                return kvm_cpu_cap_has(x86_feature) &&
                       guest_cpuid_has(vcpu, x86_feature);
        }

> +            guest_cpuid_has(vcpu, X86_FEATURE_AMD_PMU_V2)) {
> +                pmu->version = 2;
> +                entry = kvm_find_cpuid_entry_index(vcpu, 0x80000022, 0);
> +                ebx.full = entry->ebx;
> +                pmu->nr_arch_gp_counters = min3((unsigned int)ebx.split.num_core_pmc,
> +                                                (unsigned int)kvm_pmu_cap.num_counters_gp,
> +                                                (unsigned int)KVM_AMD_PMC_MAX_GENERIC);

Blech.  This really shouldn't be necessary, KVM should tweak
kvm_pmu_cap.num_counters_gp as needed during initialization to ensure
num_counters_gp doesn't exceed KVM's internal limits.

Posted a patch[*], please take a look.  As mentioned in that thread, I'll
somewhat speculatively apply that series sooner rather than later so that you
can use it as a base for this series (assuming the patch isn't busted).

[*] https://lore.kernel.org/all/20230124234905.3774678-2-seanjc@xxxxxxxxxx

> +        }
> +
> +        /* Commitment to minimal PMCs, regardless of CPUID.80000022 */

Please expand this comment.  I'm still not entirely sure I've interpreted it
correctly, and I'm not sure that I agree with the code.
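
To make the point about min3() concrete, here is a rough standalone sketch of
clamping the capability once at initialization so that the per-vCPU path only
needs a two-way min().  This is not KVM code: struct pmu_caps, pmu_caps_init(),
refresh_nr_counters(), and PMC_MAX_GENERIC (including its value) are
hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for KVM's internal limit; the value is illustrative. */
#define PMC_MAX_GENERIC 6u

/* Hypothetical, simplified stand-in for the global PMU capabilities. */
struct pmu_caps {
	unsigned int num_counters_gp;
};

static inline unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* One-time setup: clamp what hardware reports to the internal limit. */
static inline void pmu_caps_init(struct pmu_caps *caps, unsigned int hw_counters)
{
	caps->num_counters_gp = min_uint(hw_counters, PMC_MAX_GENERIC);
}

/* Per-vCPU refresh: the caps are already clamped, so min() of two values suffices. */
static inline unsigned int refresh_nr_counters(const struct pmu_caps *caps,
					       unsigned int guest_cpuid_counters)
{
	/* Documents the invariant established by pmu_caps_init(). */
	assert(caps->num_counters_gp <= PMC_MAX_GENERIC);
	return min_uint(guest_cpuid_counters, caps->num_counters_gp);
}
```

With the clamp done up front, every consumer of the capability can rely on the
invariant instead of repeating the internal limit at each call site.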

> +        if (kvm_cpu_cap_has(X86_FEATURE_PERFCTR_CORE) &&

AFAICT, checking kvm_cpu_cap_has() is an unrelated change.  Either it's a bug
fix and belongs in a separate patch, or it's unnecessary and should be dropped.

> +            guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE))
> +                pmu->nr_arch_gp_counters = max_t(unsigned int,
> +                                                 pmu->nr_arch_gp_counters,
> +                                                 AMD64_NUM_COUNTERS_CORE);
>          else
> -                pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS;
> +                pmu->nr_arch_gp_counters = max_t(unsigned int,
> +                                                 pmu->nr_arch_gp_counters,
> +                                                 AMD64_NUM_COUNTERS);

Using max() doesn't look right.  E.g. if KVM ends up running on some odd setup
where ebx.split.num_core_pmc/kvm_pmu_cap.num_counters_gp is less than
AMD64_NUM_COUNTERS_CORE or AMD64_NUM_COUNTERS, max() bumps the count back up
past that smaller value.  Or more likely, if userspace says "only expose N
counters to this guest", the request is ignored whenever N is below the legacy
count.

Shouldn't this be something like the following?

        if (guest_cpuid_has(vcpu, X86_FEATURE_AMD_PMU_V2))
                pmu->nr_arch_gp_counters = min(ebx.split.num_core_pmc,
                                               kvm_pmu_cap.num_counters_gp);
        else if (guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE))
                pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS_CORE;
        else
                pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS;
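
To show the difference between the max() and min() approaches, here is a small
standalone model of both selection schemes.  It is only a sketch under stated
assumptions: struct guest_pmu_cfg and both function names are hypothetical, the
counter starts at 0 before the refresh, the kvm_cpu_cap_has() checks are left
out, the host capability is taken to be clamped to KVM's internal limit
already, and the values used for AMD64_NUM_COUNTERS and
AMD64_NUM_COUNTERS_CORE are assumed for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Names follow the quoted patch; the values are assumed for illustration. */
#define AMD64_NUM_COUNTERS      4u
#define AMD64_NUM_COUNTERS_CORE 6u

/* Hypothetical flattened view of the inputs the refresh path consults. */
struct guest_pmu_cfg {
	bool has_pmu_v2;                   /* guest CPUID enumerates PMU v2 */
	bool has_perfctr_core;             /* guest CPUID enumerates PERFCTR_CORE */
	unsigned int cpuid_num_core_pmc;   /* counter count userspace put in CPUID */
	unsigned int host_num_counters_gp; /* host capability, assumed pre-clamped */
};

static inline unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

static inline unsigned int max_uint(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/* Models the patch as posted: min() for v2, then max() with the legacy count. */
static inline unsigned int nr_counters_posted(const struct guest_pmu_cfg *c)
{
	unsigned int nr = 0; /* assumption: the field is 0 before the refresh */

	assert(c);
	if (c->has_pmu_v2)
		nr = min_uint(c->cpuid_num_core_pmc, c->host_num_counters_gp);
	if (c->has_perfctr_core)
		return max_uint(nr, AMD64_NUM_COUNTERS_CORE);
	return max_uint(nr, AMD64_NUM_COUNTERS);
}

/* Models the suggested logic: v2 honors the guest CPUID, legacy uses fixed counts. */
static inline unsigned int nr_counters_suggested(const struct guest_pmu_cfg *c)
{
	assert(c);
	if (c->has_pmu_v2)
		return min_uint(c->cpuid_num_core_pmc, c->host_num_counters_gp);
	if (c->has_perfctr_core)
		return AMD64_NUM_COUNTERS_CORE;
	return AMD64_NUM_COUNTERS;
}
```

In this model, a v2 guest whose CPUID asks for two counters gets two from
nr_counters_suggested(), while nr_counters_posted() raises the count back to
the PERFCTR_CORE value, which is the case the "only expose N counters" concern
is about.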