On Thu, Aug 01, 2024, Mingwei Zhang wrote: > Introduce a vendor specific API to check if rdpmc passthrough allowed. > RDPMC passthrough requires guest VM have the full ownership of all > counters. These include general purpose counters and fixed counters and > some vendor specific MSRs such as PERF_METRICS. Since PERF_METRICS MSR is > Intel specific, putting the check into vendor specific code. > > Signed-off-by: Mingwei Zhang <mizhang@xxxxxxxxxx> > Tested-by: Yongwei Ma <yongwei.ma@xxxxxxxxx> > --- > arch/x86/include/asm/kvm-x86-pmu-ops.h | 1 + > arch/x86/kvm/pmu.c | 1 + > arch/x86/kvm/pmu.h | 1 + > arch/x86/kvm/svm/pmu.c | 6 ++++++ > arch/x86/kvm/vmx/pmu_intel.c | 16 ++++++++++++++++ > 5 files changed, 25 insertions(+) > > diff --git a/arch/x86/include/asm/kvm-x86-pmu-ops.h b/arch/x86/include/asm/kvm-x86-pmu-ops.h > index f852b13aeefe..fd986d5146e4 100644 > --- a/arch/x86/include/asm/kvm-x86-pmu-ops.h > +++ b/arch/x86/include/asm/kvm-x86-pmu-ops.h > @@ -20,6 +20,7 @@ KVM_X86_PMU_OP(get_msr) > KVM_X86_PMU_OP(set_msr) > KVM_X86_PMU_OP(refresh) > KVM_X86_PMU_OP(init) > +KVM_X86_PMU_OP(is_rdpmc_passthru_allowed) > KVM_X86_PMU_OP_OPTIONAL(reset) > KVM_X86_PMU_OP_OPTIONAL(deliver_pmi) > KVM_X86_PMU_OP_OPTIONAL(cleanup) > diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c > index 19104e16a986..3afefe4cf6e2 100644 > --- a/arch/x86/kvm/pmu.c > +++ b/arch/x86/kvm/pmu.c > @@ -102,6 +102,7 @@ bool kvm_pmu_check_rdpmc_passthrough(struct kvm_vcpu *vcpu) > > if (is_passthrough_pmu_enabled(vcpu) && > !enable_vmware_backdoor && > + static_call(kvm_x86_pmu_is_rdpmc_passthru_allowed)(vcpu) && If the polarity is inverted, the callback can be OPTIONAL_RET0 on AMD. E.g. if (kvm_pmu_call(rdpmc_needs_intercept(vcpu))) return false; > +static bool intel_is_rdpmc_passthru_allowed(struct kvm_vcpu *vcpu) > +{ > + /* > + * Per Intel SDM vol. 2 for RDPMC, Please don't reference specific sections in the comments. For changelogs it's ok, because changelogs are a snapshot in time. But comments are living things and will become stale in almost every case. And I don't see any reason to reference the SDM, just state the behavior; it's implied that that's the architectural behavior, otherwise KVM is buggy. > MSR_PERF_METRICS is accessible by This is technically wrong, the SDM states that the RDPMC behavior is implementation specific. That matters to some extent, because if it was _just_ one MSR and was guaranteed to always be that one MSR, it might be worth creating a virtualization hole. /* * Intercept RDPMC if the host supports PERF_METRICS, but the guest * does not, as RDPMC with type 0x2000 accesses implementation specific * metrics. */ All that said, isn't this redundant with the number of fixed counters? I'm having a hell of a time finding anything concrete in the SDM, but IIUC fixed counter 3 is tightly coupled to perf metrics. E.g. rather than add a vendor hook just for this, rely on the fixed counters and refuse to enable the mediated PMU if the underlying CPU model is nonsensical, i.e. perf metrics exists without ctr3. And I kinda think we have to go that route, because enabling RDPMC interception based on future features is doomed from the start. E.g. if this code had been written prior to PERF_METRICS, older KVMs would have zero clue that RDPMC needs to be intercepted on newer hardware.