On 2023-11-08 2:35 p.m., Sean Christopherson wrote:
> On Wed, Nov 08, 2023, Kan Liang wrote:
>> On 2023-11-07 7:31 p.m., Sean Christopherson wrote:
>>> @@ -442,8 +396,29 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>>  	return 0;
>>>  }
>>>
>>> +/*
>>> + * Map fixed counter events to architectural general purpose event encodings.
>>> + * Perf doesn't provide APIs to allow KVM to directly program a fixed counter,
>>> + * and so KVM instead programs the architectural event to effectively request
>>> + * the fixed counter.  Perf isn't guaranteed to use a fixed counter and may
>>> + * instead program the encoding into a general purpose counter, e.g. if a
>>> + * different perf_event is already utilizing the requested counter, but the end
>>> + * result is the same (ignoring the fact that using a general purpose counter
>>> + * will likely exacerbate counter contention).
>>> + *
>>> + * Note, reference cycles is counted using a perf-defined "pseudo-encoding",
>>> + * as there is no architectural general purpose encoding for reference cycles.
>>
>> It's not the case for the latest Intel platforms anymore. Please see
>> ffbe4ab0beda ("perf/x86/intel: Extend the ref-cycles event to GP counters").
>
> Ugh, yeah.  But that can and should be easy to handle on top.
>
>> Maybe perf should export .event_map to KVM somehow.
>
> Oh for ***** sake, perf already does export this for KVM.  Untested, but the
> below should do the trick.  If I need to spin another version of this series
> then I'll fold it in, otherwise I'll post it as something on top.
>
> There's also an optimization to be had for kvm_pmu_trigger_event(), which
> incurs an indirect branch not only on every invocation, but on every
> iteration.  I'll post that one separately.
>
> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> index 5fc5a62af428..a02e13c2e5e6 100644
> --- a/arch/x86/kvm/vmx/pmu_intel.c
> +++ b/arch/x86/kvm/vmx/pmu_intel.c
> @@ -405,25 +405,32 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>   * different perf_event is already utilizing the requested counter, but the end
>   * result is the same (ignoring the fact that using a general purpose counter
>   * will likely exacerbate counter contention).
> - *
> - * Note, reference cycles is counted using a perf-defined "pseudo-encoding",
> - * as there is no architectural general purpose encoding for reference cycles.
>   */
>  static u64 intel_get_fixed_pmc_eventsel(int index)
>  {
> -	const struct {
> -		u8 eventsel;
> -		u8 unit_mask;
> -	} fixed_pmc_events[] = {
> -		[0] = { 0xc0, 0x00 },	/* Instruction Retired / PERF_COUNT_HW_INSTRUCTIONS. */
> -		[1] = { 0x3c, 0x00 },	/* CPU Cycles / PERF_COUNT_HW_CPU_CYCLES. */
> -		[2] = { 0x00, 0x03 },	/* Reference Cycles / PERF_COUNT_HW_REF_CPU_CYCLES. */
> +	enum perf_hw_id perf_id;
> +	u64 eventsel;
> +
> +	BUILD_BUG_ON(KVM_PMC_MAX_FIXED != 3);
> +
> +	switch (index) {
> +	case 0:
> +		perf_id = PERF_COUNT_HW_INSTRUCTIONS;
> +		break;
> +	case 1:
> +		perf_id = PERF_COUNT_HW_CPU_CYCLES;
> +		break;
> +	case 2:
> +		perf_id = PERF_COUNT_HW_REF_CPU_CYCLES;
> +		break;
> +	default:
> +		WARN_ON_ONCE(1);
> +		return 0;
>  	};
>
> -	BUILD_BUG_ON(ARRAY_SIZE(fixed_pmc_events) != KVM_PMC_MAX_FIXED);
> -
> -	return (fixed_pmc_events[index].unit_mask << 8) |
> -	       fixed_pmc_events[index].eventsel;
> +	eventsel = perf_get_hw_event_config(perf_id);

Yes, perf_get_hw_event_config() can return the updated event encoding.
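For reference, the perf side of that helper is tiny: it just funnels the
generic PERF_COUNT_HW_* ID through the active x86 PMU's .event_map()
callback, so a platform-specific change to an event's encoding (like the
ref-cycles change above) is picked up automatically. Roughly (a sketch from
memory of arch/x86/events/core.c; the exact bounds/speculation handling may
differ):

	u64 perf_get_hw_event_config(int hw_event)
	{
		int max = x86_pmu.max_events;

		/*
		 * Translate the generic hardware event ID into the
		 * vendor-specific eventsel/umask encoding via the active
		 * PMU's event table.
		 */
		if (hw_event < max)
			return x86_pmu.event_map(array_index_nospec(hw_event, max));

		return 0;
	}
	EXPORT_SYMBOL_GPL(perf_get_hw_event_config);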
Thanks,
Kan

> +	WARN_ON_ONCE(!eventsel && index < kvm_pmu_cap.num_counters_fixed);
> +	return eventsel;
>  }
>
>  static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
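As an aside, for anyone following along: the config value returned above uses
the architectural layout, i.e. unit mask in bits 15:8 and event select in
bits 7:0, which is exactly what the deleted fixed_pmc_events[] table
hand-packed. A standalone userspace illustration (ordinary C, not kernel
code; pack_eventsel() is a made-up helper for the example):

	#include <inttypes.h>
	#include <stdint.h>
	#include <stdio.h>

	/* Pack an architectural event: unit mask in bits 15:8, event select in 7:0. */
	static uint64_t pack_eventsel(uint8_t eventsel, uint8_t unit_mask)
	{
		return ((uint64_t)unit_mask << 8) | eventsel;
	}

	int main(void)
	{
		/* Values from the deleted fixed_pmc_events[] table in the diff above. */
		printf("instructions: 0x%04" PRIx64 "\n", pack_eventsel(0xc0, 0x00));
		printf("cpu-cycles:   0x%04" PRIx64 "\n", pack_eventsel(0x3c, 0x00));
		printf("ref-cycles:   0x%04" PRIx64 "\n", pack_eventsel(0x00, 0x03));
		return 0;
	}

This prints 0x00c0, 0x003c, and 0x0300; the last is the perf-defined
pseudo-encoding for reference cycles mentioned in the removed comment.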