On Thu, Sep 14, 2023, Like Xu wrote:
> On 7/4/2023 11:37 pm, Sean Christopherson wrote:
> > On Fri, Apr 07, 2023, Like Xu wrote:
> /*
> * The guest vPMU counter emulation depends on the EVENTSEL_GUESTONLY bit.
> * If this bit is present on the host, the host needs to support at least
> the PERFCTR_CORE.
> */

...

> > /*
> > * KVM requires guest-only event support in order to isolate guest PMCs
> > * from host PMCs. SVM doesn't provide a way to atomically load MSRs
> > * on VMRUN, and manually adjusting counts before/after VMRUN is not
> > * accurate enough to properly virtualize a PMU.
> > */
> >
> > But now I'm really confused, because if I'm reading the code correctly, perf
> > invokes amd_core_hw_config() for legacy PMUs, i.e. even if PERFCTR_CORE isn't
> > supported. And the APM documents the host/guest bits only for "Core Performance
> > Event-Select Registers".
> >
> > So either (a) GUESTONLY isn't supported on legacy CPUs and perf is relying on AMD
> > CPUs ignoring reserved bits or (b) GUESTONLY _is_ supported on legacy PMUs and
> > pmu_has_guestonly_mode() is checking the wrong MSR when running on older CPUs.
> >
> > And if (a) is true, then how on earth does KVM support vPMU when running on a
> > legacy PMU? Is vPMU on AMD just wildly broken? Am I missing something?
> >
>
> (a) It's true and AMD guest vPMU have only been implemented accurately with
> the help of this GUESTONLY bit.
>
> There are two other scenarios worth discussing here: one is support L2 vPMU
> on the PERFCTR_CORE+ host and this proposal is disabling it; and the other
> case is to support AMD legacy vPMU on the PERFCTR_CORE+ host.

Oooh, so the really problematic case is when PERFCTR_CORE+ is supported but
GUESTONLY is not, in which case KVM+perf *think* they can use GUESTONLY (and
HOSTONLY).

That's a straight up KVM (as L0) bug, no? I don't see anything in the APM that
suggests those bits are optional, i.e. KVM is blatantly violating AMD's
architecture by ignoring those bits.

I would rather fix KVM (as L0). It doesn't seem _that_ hard to support, e.g. modify reprogram_counter() to disable the counter if it's supposed to be silent for the current mode, and reprogram all counters if EFER.SVME is toggled, and on all nested transitions.
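
To make the "silent for the current mode" check concrete, here is a rough, standalone sketch. It is not KVM code: counter_is_silent() and its parameters are hypothetical names invented for illustration, and the bit positions (40 for GuestOnly, 41 for HostOnly) are what I believe perf's AMD64_EVENTSEL_* definitions use. The handling of EFER.SVME=0 (either bit set means the counter counts nothing) is an assumption based on my reading of perf's AMD code, so double check it against the APM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, mirroring perf's AMD64_EVENTSEL_* definitions. */
#define AMD64_EVENTSEL_GUESTONLY	(1ULL << 40)
#define AMD64_EVENTSEL_HOSTONLY		(1ULL << 41)

/*
 * Hypothetical helper: return true if a vCPU's counter should not count in
 * the vCPU's current mode.  @svme is the vCPU's EFER.SVME, @in_guest is true
 * when the vCPU is running a nested (L2) guest.
 */
static bool counter_is_silent(uint64_t eventsel, bool svme, bool in_guest)
{
	bool guest_only = eventsel & AMD64_EVENTSEL_GUESTONLY;
	bool host_only = eventsel & AMD64_EVENTSEL_HOSTONLY;

	/* Neither bit set: the counter counts in every mode. */
	if (!guest_only && !host_only)
		return false;

	/* ASSUMPTION: with EFER.SVME=0, either bit set means "count nothing". */
	if (!svme)
		return true;

	/* Both bits set with EFER.SVME=1: count in host and guest. */
	if (guest_only && host_only)
		return false;

	/* Exactly one bit set: silent whenever the mode doesn't match. */
	return guest_only ? !in_guest : in_guest;
}
```

The result only changes when EFER.SVME or the nested mode changes, which is why the counters would need to be reprogrammed on exactly those events.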