On Wed, Nov 08, 2023, Kan Liang wrote:
> On 2023-11-07 7:31 p.m., Sean Christopherson wrote:
> > @@ -442,8 +396,29 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> >  	return 0;
> >  }
> > 
> > +/*
> > + * Map fixed counter events to architectural general purpose event encodings.
> > + * Perf doesn't provide APIs to allow KVM to directly program a fixed counter,
> > + * and so KVM instead programs the architectural event to effectively request
> > + * the fixed counter. Perf isn't guaranteed to use a fixed counter and may
> > + * instead program the encoding into a general purpose counter, e.g. if a
> > + * different perf_event is already utilizing the requested counter, but the end
> > + * result is the same (ignoring the fact that using a general purpose counter
> > + * will likely exacerbate counter contention).
> > + *
> > + * Note, reference cycles is counted using a perf-defined "psuedo-encoding",
> > + * as there is no architectural general purpose encoding for reference cycles.
> 
> It's not the case for the latest Intel platforms anymore. Please see
> ffbe4ab0beda ("perf/x86/intel: Extend the ref-cycles event to GP counters").

Ugh, yeah.  But that should actually be easier to do on top.

> Maybe perf should export .event_map to KVM somehow.

Oh for ***** sake, perf already does export this for KVM.  Untested, but the
below should do the trick.  If I need to spin another version of this series
then I'll fold it in, otherwise I'll post it as something on top.

There's also an optimization to be had for kvm_pmu_trigger_event(), which
incurs an indirect branch not only on every invocation, but on every
iteration.  I'll post this one separately.

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 5fc5a62af428..a02e13c2e5e6 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -405,25 +405,32 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
  * different perf_event is already utilizing the requested counter, but the end
  * result is the same (ignoring the fact that using a general purpose counter
  * will likely exacerbate counter contention).
- *
- * Note, reference cycles is counted using a perf-defined "psuedo-encoding",
- * as there is no architectural general purpose encoding for reference cycles.
  */
 static u64 intel_get_fixed_pmc_eventsel(int index)
 {
-	const struct {
-		u8 eventsel;
-		u8 unit_mask;
-	} fixed_pmc_events[] = {
-		[0] = { 0xc0, 0x00 }, /* Instruction Retired / PERF_COUNT_HW_INSTRUCTIONS. */
-		[1] = { 0x3c, 0x00 }, /* CPU Cycles/ PERF_COUNT_HW_CPU_CYCLES. */
-		[2] = { 0x00, 0x03 }, /* Reference Cycles / PERF_COUNT_HW_REF_CPU_CYCLES*/
+	enum perf_hw_id perf_id;
+	u64 eventsel;
+
+	BUILD_BUG_ON(KVM_PMC_MAX_FIXED != 3);
+
+	switch (index) {
+	case 0:
+		perf_id = PERF_COUNT_HW_INSTRUCTIONS;
+		break;
+	case 1:
+		perf_id = PERF_COUNT_HW_CPU_CYCLES;
+		break;
+	case 2:
+		perf_id = PERF_COUNT_HW_REF_CPU_CYCLES;
+		break;
+	default:
+		WARN_ON_ONCE(1);
+		return 0;
 	};
 
-	BUILD_BUG_ON(ARRAY_SIZE(fixed_pmc_events) != KVM_PMC_MAX_FIXED);
-
-	return (fixed_pmc_events[index].unit_mask << 8) |
-		fixed_pmc_events[index].eventsel;
+	eventsel = perf_get_hw_event_config(perf_id);
+	WARN_ON_ONCE(!eventsel && index < kvm_pmu_cap.num_counters_fixed);
+	return eventsel;
 }
 
 static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
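
For reference, the export in question is perf_get_hw_event_config().  From
memory (treat this as a rough sketch rather than the authoritative source;
the real thing lives in arch/x86/events/core.c), the x86 implementation is
approximately:

	/*
	 * Translate a generic perf hardware event ID (PERF_COUNT_HW_*) into
	 * the vendor's raw eventsel+umask encoding via the x86_pmu event map.
	 * Out-of-range event IDs yield 0.
	 */
	u64 perf_get_hw_event_config(int hw_event)
	{
		int max = x86_pmu.max_events;

		if (hw_event < max)
			return x86_pmu.event_map(array_index_nospec(hw_event, max));

		return 0;
	}
	EXPORT_SYMBOL_GPL(perf_get_hw_event_config);

On Intel that resolves to the same encodings the hardcoded table produced,
e.g. 0x00c0 for PERF_COUNT_HW_INSTRUCTIONS, 0x003c for PERF_COUNT_HW_CPU_CYCLES,
and the 0x0300 pseudo-encoding for PERF_COUNT_HW_REF_CPU_CYCLES.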