On Thu, Nov 9, 2023 at 10:24 AM Konstantin Khorenko <khorenko@xxxxxxxxxxxxx> wrote:
>
> The following 2 mainstream patches have introduced extra
> events accounting:
>
>  018d70ffcfec ("KVM: x86: Update vPMCs when retiring branch instructions")
>  9cd803d496e7 ("KVM: x86: Update vPMCs when retiring instructions")
>
> kvm_pmu_trigger_event() iterates over all PMCs looking for enabled and
> this appeared to be fast on Intel CPUs and quite expensive for AMD CPUs.
>
> kvm_pmu_trigger_event() can be optimized not to iterate over all PMCs in
> the following cases:
>
>  * if PMU is completely disabled for a VM, which is the default
>    configuration
>  * if PMU v2 is enabled, but no PMCs are configured
>
> For Intel CPUs:
>  * By default PMU is disabled for KVM VMs (<pmu state='off'/> or absent
>    in the VM config xml which results in "-cpu pmu=off" qemu option).
>    In this case pmu->version is reported as 0 for the appropriate vCPU.
>
>  * According to Intel® 64 and IA-32 Architectures Software Developer’s
>    Manual PMU version 2 and higher provide IA32_PERF_GLOBAL_CTRL MSR
>    which in particular contains bits which can be used for efficient
>    detection which fixed-function performance and general-purpose
>    performance monitoring counters are enabled at the moment.
>
>  * Searching for enabled PMCs is fast and the optimization does not
>    bring noticeable performance increase.
>
> For AMD CPUs:
>  * For CPUs older than Zen 4 pmu->version is always reported as "1" for
>    the appropriate vCPU, no matter if PMU is disabled for KVM VMs
>    (<pmu state='off'/>) or enabled.
>    So for "old" CPUs currently it's impossible to detect when PMU is
>    disabled for a VM and skip the iteration by PMCs efficiently.
>
>  * Since Zen 4 AMD CPUs support PMU v2 and in this case pmu->version
>    should be reported as "2" and IA32_PERF_GLOBAL_CTRL MSR is available
>    and can be used for fast and efficient check for enabled PMCs.
>    https://www.phoronix.com/news/AMD-PerfMonV2-Linux-Patches
>    https://www.phoronix.com/news/AMD-PerfMonV2-Guests-KVM
>
>  * Optimized preliminary check for enabled PMCs on AMD Zen 4 CPUs
>    should give quite noticeable performance improvement.
>
> AMD performance results:
> CPU: AMD Zen 3 (three!): AMD EPYC 7443P 24-Core Processor
>
>  * The test binary is run inside an AlmaLinux 9 VM with their stock kernel
>    5.14.0-284.11.1.el9_2.x86_64.
>  * Test binary checks the CPUID instructions rate (instructions per sec).
>  * Default VM config (PMU is off, pmu->version is reported as 1).
>  * The Host runs the kernel under test.
>
>  # for i in 1 2 3 4 5 ; do ./at_cpu_cpuid.pub ; done | \
>    awk -e '{print $4;}' | \
>    cut -f1 --delimiter='.' | \
>    ./avg.sh
>
> Measurements:
> 1. Host runs stock latest mainstream kernel commit 305230142ae0.
> 2. Host runs same mainstream kernel + current patch.
> 3. Host runs same mainstream kernel + current patch + force
>    guest_pmu_is_enabled() to always return "false" using following change:
>
>    -       if (pmu->version >= 2 && !(pmu->global_ctrl & ~pmu->global_ctrl_mask))
>    +       if (pmu->version == 1 && !(pmu->global_ctrl & ~pmu->global_ctrl_mask))
>
> --------------------------------------
> | Kernels | CPUID rate               |
> --------------------------------------
> | 1.      | 1360250                  |
> | 2.      | 1365536 (+ 0.4%)         |
> | 3.      | 1541850 (+13.4%)         |
> --------------------------------------
>
> Measurement (2) gives some fluctuation, the performance is not increased
> because the test was done on a Zen 3 CPU, so we are unable to use fast
> check for active PMCs.
> Measurement (3) shows expected performance boost on a Zen 4 CPU under
> the same test.
>
> Signed-off-by: Konstantin Khorenko <khorenko@xxxxxxxxxxxxx>
> ---
>  arch/x86/kvm/pmu.c | 26 ++++++++++++++++++++++++++
>  1 file changed, 26 insertions(+)
>
> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index 9ae07db6f0f6..290d407f339b 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -731,12 +731,38 @@ static inline bool cpl_is_matched(struct kvm_pmc *pmc)
>         return (static_call(kvm_x86_get_cpl)(pmc->vcpu) == 0) ? select_os : select_user;
>  }
>
> +static inline bool guest_pmu_is_enabled(struct kvm_pmu *pmu)
> +{
> +       /*
> +        * Currently VMs do not have PMU settings in configs which defaults
> +        * to "pmu=off".
> +        *
> +        * For Intel currently this means pmu->version will be 0.
> +        * For AMD currently PMU cannot be disabled:

Isn't that what KVM_PMU_CAP_DISABLE is for?
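
For reference, the early-out described in the quoted changelog can be modeled
in userspace roughly as below. This is only a sketch, not the patch's code:
struct pmu_model and pmu_model_may_have_enabled_pmcs() are hypothetical names,
and the only parts taken from the text above are the "version 0 means no PMU"
case and the version >= 2 test on global_ctrl and global_ctrl_mask.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the few struct kvm_pmu fields the check reads.
 * The field names mirror the quoted diff; the struct itself is an assumption.
 */
struct pmu_model {
	uint8_t  version;          /* 0: no PMU exposed, >= 2: global control MSR exists */
	uint64_t global_ctrl;      /* guest value of the global control MSR */
	uint64_t global_ctrl_mask; /* bits the guest is not allowed to set */
};

/*
 * Returns false only when it is certain that no PMC can be enabled, so the
 * caller may skip iterating over the counters. For version 1 there is no
 * global control MSR to consult, so the answer has to stay "maybe".
 */
static inline bool pmu_model_may_have_enabled_pmcs(const struct pmu_model *pmu)
{
	/* PMU not exposed to the guest at all. */
	if (pmu->version == 0)
		return false;

	/* v2+: no valid enable bit set in the global control MSR. */
	if (pmu->version >= 2 && !(pmu->global_ctrl & ~pmu->global_ctrl_mask))
		return false;

	return true;
}
```

With this model a version 1 PMU always takes the slow path, which matches the
Zen 3 result in measurement (2).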