v4:
* Correct compilation failure on PowerPC

v3:
https://lore.kernel.org/kvm/20240912205133.4171576-1-coltonlewis@xxxxxxxxxx/

v2:
https://lore.kernel.org/kvm/20240911222433.3415301-1-coltonlewis@xxxxxxxxxx/

v1:
https://lore.kernel.org/kvm/20240904204133.1442132-1-coltonlewis@xxxxxxxxxx/

This series cleans up perf recording around guest events and improves
the accuracy of the resulting perf reports.

Perf was incorrectly treating any PMU overflow interrupt that occurred
while a VCPU was loaded as a guest event, even when the event was not
truly a guest event. This led to much less accurate and less useful
perf recordings.

As an example, see the reports below of `perf record
dirty_log_perf_test -m 2 -v 4` before and after the series on ARM64.

Without series:

Samples: 15K of event 'instructions', Event count (approx.): 31830580924
Overhead  Command          Shared Object        Symbol
  54.54%  dirty_log_perf_  dirty_log_perf_test  [.] run_test
   5.39%  dirty_log_perf_  dirty_log_perf_test  [.] vcpu_worker
   0.89%  dirty_log_perf_  [kernel.vmlinux]     [k] release_pages
   0.70%  dirty_log_perf_  [kernel.vmlinux]     [k] free_pcppages_bulk
   0.62%  dirty_log_perf_  dirty_log_perf_test  [.] userspace_mem_region_find
   0.49%  dirty_log_perf_  dirty_log_perf_test  [.] sparsebit_is_set
   0.46%  dirty_log_perf_  dirty_log_perf_test  [.] _virt_pg_map
   0.46%  dirty_log_perf_  dirty_log_perf_test  [.] node_add
   0.37%  dirty_log_perf_  dirty_log_perf_test  [.] node_reduce
   0.35%  dirty_log_perf_  [kernel.vmlinux]     [k] free_unref_page_commit
   0.33%  dirty_log_perf_  [kernel.vmlinux]     [k] __kvm_pgtable_walk
   0.31%  dirty_log_perf_  [kernel.vmlinux]     [k] stage2_attr_walker
   0.29%  dirty_log_perf_  [kernel.vmlinux]     [k] unmap_page_range
   0.29%  dirty_log_perf_  dirty_log_perf_test  [.] test_assert
   0.26%  dirty_log_perf_  [kernel.vmlinux]     [k] __mod_memcg_lruvec_state
   0.24%  dirty_log_perf_  [kernel.vmlinux]     [k] kvm_s2_put_page

With series:

Samples: 15K of event 'instructions', Event count (approx.): 30898031385
Overhead  Command          Shared Object        Symbol
  54.05%  dirty_log_perf_  dirty_log_perf_test  [.] run_test
   5.48%  dirty_log_perf_  [kernel.kallsyms]    [k] kvm_arch_vcpu_ioctl_run
   4.70%  dirty_log_perf_  dirty_log_perf_test  [.] vcpu_worker
   3.11%  dirty_log_perf_  [kernel.kallsyms]    [k] kvm_handle_guest_abort
   2.24%  dirty_log_perf_  [kernel.kallsyms]    [k] up_read
   1.98%  dirty_log_perf_  [kernel.kallsyms]    [k] __kvm_tlb_flush_vmid_ipa_nsh
   1.97%  dirty_log_perf_  [kernel.kallsyms]    [k] __pi_clear_page
   1.30%  dirty_log_perf_  [kernel.kallsyms]    [k] down_read
   1.13%  dirty_log_perf_  [kernel.kallsyms]    [k] release_pages
   1.12%  dirty_log_perf_  [kernel.kallsyms]    [k] __kvm_pgtable_walk
   1.08%  dirty_log_perf_  [kernel.kallsyms]    [k] folio_batch_move_lru
   1.06%  dirty_log_perf_  [kernel.kallsyms]    [k] __srcu_read_lock
   1.03%  dirty_log_perf_  [kernel.kallsyms]    [k] get_page_from_freelist
   1.01%  dirty_log_perf_  [kernel.kallsyms]    [k] __pte_offset_map_lock
   0.82%  dirty_log_perf_  [kernel.kallsyms]    [k] handle_mm_fault
   0.74%  dirty_log_perf_  [kernel.kallsyms]    [k] mas_state_walk

Colton Lewis (5):
  arm: perf: Drop unused functions
  perf: Hoist perf_instruction_pointer() and perf_misc_flags()
  powerpc: perf: Use perf_arch_instruction_pointer()
  x86: perf: Refactor misc flag assignments
  perf: Correct perf sampling with guest VMs

 arch/arm/include/asm/perf_event.h            |  7 ---
 arch/arm/kernel/perf_callchain.c             | 17 -------
 arch/arm64/include/asm/perf_event.h          |  4 --
 arch/arm64/kernel/perf_callchain.c           | 28 ------------
 arch/powerpc/include/asm/perf_event_server.h |  6 +--
 arch/powerpc/perf/callchain.c                |  2 +-
 arch/powerpc/perf/callchain_32.c             |  2 +-
 arch/powerpc/perf/callchain_64.c             |  2 +-
 arch/powerpc/perf/core-book3s.c              |  4 +-
 arch/s390/include/asm/perf_event.h           |  6 +--
 arch/s390/kernel/perf_event.c                |  4 +-
 arch/x86/events/core.c                       | 47 +++++++++++---------
 arch/x86/include/asm/perf_event.h            | 12 ++---
 include/linux/perf_event.h                   | 26 +++++++++--
 kernel/events/core.c                         | 27 ++++++++++-
 15 files changed, 95 insertions(+), 99 deletions(-)


base-commit: da3ea35007d0af457a0afc87e84fddaebc4e0b63
--
2.46.0.792.g87dc391469-goog