Hi,

The Ice Lake core PMU provides built-in support for Top-down
Microarchitecture Analysis (TMA) method level 1 metrics. These metrics
are always available to cross-validate performance observations, freeing
general purpose counters to count other events in high counter
utilization scenarios. For more details about the method, refer to the
Top-Down Analysis Method chapter (Appendix B.1) of the Intel® 64 and
IA-32 Architectures Optimization Reference Manual.
(SDM 19.3.9.3 Performance Metrics)

This patchset enables Intel guest Topdown for KVM-based guests. The
basic enabling framework remains unchanged: a PERF_METRICS MSR is
introduced, a group (rather than one) of perf_events is created in KVM
and bound to fixed counter3 to obtain hardware resources, and the guest
value of the PERF_METRICS MSR is assembled from the counts of the
grouped perf_events.

On KVM, patches 4-6 may be reviewed independently if KVM only enables
fixed counter3 as a normal slots event for counting and sampling.
Patch 7 updates the infrastructure for creating grouped events in KVM,
and patch 8 uses grouped events to emulate the guest MSR_PERF_METRICS.

On perf, patches 1-3 are awaiting review for tip/perf/core, and could
be accepted separately if they make sense. TBH, I don't think our
perf/core is fully prepared to support kernel space grouped counters,
considering the comments around perf_enable_diasable(). But after much
exploration on my part, this is probably the most promising way to get
KVM to create slots plus metrics events. If the addition of
*group_leader messes things up, please shout at me.

More details in each commit message may answer code-related questions.
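To make the "assembled from the counts of the grouped perf_events" step
concrete, here is a minimal user-space sketch. It is not the code from
patch 8. It assumes the level 1 PERF_METRICS layout of four 8-bit
fields (retiring, bad speculation, frontend bound, backend bound, from
bit 0 upwards), each expressing a fraction of slots scaled to 0xff. The
helper names (td_scale, td_assemble_metrics, td_metric_field) and the
round-to-nearest choice are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field order of the level 1 metrics, lowest byte first. */
enum td_metric { TD_RETIRING, TD_BAD_SPEC, TD_FE_BOUND, TD_BE_BOUND, TD_NR_METRICS };

/*
 * Scale one event count to an 8-bit fraction of slots (0xff == 100%).
 * Rounds to nearest and clamps; assumes count < 2^56 so the multiply
 * cannot overflow (hardware counters are narrower than that).
 */
static inline uint8_t td_scale(uint64_t count, uint64_t slots)
{
	uint64_t field;

	if (!slots)
		return 0;
	field = (count * 0xff + slots / 2) / slots;
	return field > 0xff ? 0xff : (uint8_t)field;
}

/* Pack the four scaled fractions into a PERF_METRICS-style value. */
static inline uint64_t td_assemble_metrics(uint64_t slots, uint64_t retiring,
					   uint64_t bad_spec, uint64_t fe_bound,
					   uint64_t be_bound)
{
	return (uint64_t)td_scale(retiring, slots) << (8 * TD_RETIRING) |
	       (uint64_t)td_scale(bad_spec, slots) << (8 * TD_BAD_SPEC) |
	       (uint64_t)td_scale(fe_bound, slots) << (8 * TD_FE_BOUND) |
	       (uint64_t)td_scale(be_bound, slots) << (8 * TD_BE_BOUND);
}

/* Extract one 8-bit field back out of the packed value. */
static inline uint8_t td_metric_field(uint64_t metrics, enum td_metric idx)
{
	assert(idx < TD_NR_METRICS);
	return (metrics >> (8 * idx)) & 0xff;
}
```

Feeding in the counts from the perf stat example in this letter (slots
34,505,682 with 14,208,222 / 1,623,796 / 14,614,171 / 3,788,859) yields
the fields 105, 12, 108 and 28. Truncating instead of rounding would
turn the bad speculation field into 11, which is why the sketch rounds.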

A classic perf tool usage on a Linux guest is as follows:

$ perf stat --topdown --td-level=1 -I1000 --no-metric-only sleep 1
#           time             counts unit events
     1.000548528         34,505,682      slots
     1.000548528         14,208,222      topdown-retiring     #  41.5% Retiring
     1.000548528          1,623,796      topdown-bad-spec     #   4.7% Bad Speculation
     1.000548528         14,614,171      topdown-fe-bound     #  42.7% Frontend Bound
     1.000548528          3,788,859      topdown-be-bound     #  11.1% Backend Bound

Related KUT will follow if there are no obstructive negative comments.

Nit, pre-patches include:
https://lore.kernel.org/kvm/20221207071506.15733-2-likexu@xxxxxxxxxxx/
https://lore.kernel.org/kvm/20221205122048.16023-1-likexu@xxxxxxxxxxx/

Please feel free to comment and share your feedback.

Thanks,

Like Xu (8):
  perf/core: Add *group_leader to perf_event_create_kernel_counter()
  perf: x86/core: Expose the available number of the Topdown metrics
  perf: x86/core: Snyc PERF_METRICS bit together with fixed counter3
  KVM: x86/pmu: Add Intel CPUID-hinted Topdown Slots event
  KVM: x86/pmu: Add kernel-defined slots event to enable Fixed Counter3
  KVM: x86/pmu: properly use INTEL_PMC_FIXED_RDPMC_BASE macro
  KVM: x86/pmu: Use flex *event arrays to implement grouped events
  KVM: x86/pmu: Add MSR_PERF_METRICS MSR emulation to enable Topdown

 arch/arm64/kvm/pmu-emul.c                 |   4 +-
 arch/x86/events/core.c                    |   1 +
 arch/x86/events/intel/core.c              |   3 +
 arch/x86/include/asm/kvm_host.h           |  14 +-
 arch/x86/include/asm/msr-index.h          |   1 +
 arch/x86/include/asm/perf_event.h         |   1 +
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c |   4 +-
 arch/x86/kvm/pmu.c                        | 149 ++++++++++++++++++++--
 arch/x86/kvm/pmu.h                        |  31 +++--
 arch/x86/kvm/svm/pmu.c                    |   1 +
 arch/x86/kvm/vmx/pmu_intel.c              |  53 +++++++-
 arch/x86/kvm/vmx/vmx.c                    |   3 +
 arch/x86/kvm/x86.c                        |   9 +-
 include/linux/perf_event.h                |   1 +
 kernel/events/core.c                      |   4 +-
 kernel/events/hw_breakpoint.c             |   4 +-
 kernel/events/hw_breakpoint_test.c        |   2 +-
 kernel/watchdog_hld.c                     |   2 +-
 18 files changed, 239 insertions(+), 48 deletions(-)

-- 
2.38.2