On Fri, Nov 3, 2023 at 5:02 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote: > > Now that KVM hides fixed counters that can't be virtualized, treat fixed > counters as available when they are supported, i.e. don't silently ignore > an enabled fixed counter just because guest CPUID says the associated > general purpose architectural event is unavailable. > > KVM originally treated fixed counters as always available, but that got > changed as part of a fix to avoid confusing REF_CPU_CYCLES, which does NOT > map to an architectural event, with the actual architectural event used > associated with bit 7, TOPDOWN_SLOTS. > > The commit justified the change with: > > If the event is marked as unavailable in the Intel guest CPUID > 0AH.EBX leaf, we need to avoid any perf_event creation, whether > it's a gp or fixed counter. > > but that justification doesn't mesh with reality. The Intel SDM uses > "architectural events" to refer to both general purpose events (the ones > with the reverse polarity mask in CPUID.0xA.EBX) and the events for fixed > counters, e.g. the SDM makes statements like: > > Each of the fixed-function PMC can count only one architectural > performance event. > > but the fact that fixed counter 2 (TSC reference cycles) doesn't have an > associated general purpose architectural makes trying to apply the mask > from CPUID.0xA.EBX impossible. Furthermore, the SDM never explicitly > says that an architectural events that's marked unavailable in EBX affects > the fixed counters. > > Note, at the time of the change, KVM didn't enforce hardware support, i.e. > didn't prevent userspace from enumerating support in guest CPUID.0xA.EBX > for architectural events that aren't supported in hardware. I.e. silently > dropping the fixed counter didn't somehow protection against counting the > wrong event, it just enforced guest CPUID. > > Arguably, userspace is creating a bogus vCPU model by advertising a fixed > counter but saying the associated general purpose architectural event is > unavailable. But regardless of the validity of the vCPU model, letting > the guest enable a fixed counter and then not actually having it count > anything is completely nonsensical. I.e. even if all of the above is > wrong and it's illegal for a fixed counter to exist when the architectural > event is unavailable, silently doing nothing is still the wrong behavior > and KVM should instead disallow enabling the fixed counter in the first > place. > > Fixes: a21864486f7e ("KVM: x86/pmu: Fix available_event_types check for REF_CPU_CYCLES event") > Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx> > --- > arch/x86/kvm/vmx/pmu_intel.c | 15 ++++++++++++++- > 1 file changed, 14 insertions(+), 1 deletion(-) > > diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c > index 8d545f84dc4a..b239e7dbdc9b 100644 > --- a/arch/x86/kvm/vmx/pmu_intel.c > +++ b/arch/x86/kvm/vmx/pmu_intel.c > @@ -147,11 +147,24 @@ static bool intel_hw_event_available(struct kvm_pmc *pmc) As discussed (https://lore.kernel.org/kvm/ZUU12-TUR_1cj47u@xxxxxxxxxx/), this function should go away. > u8 unit_mask = (pmc->eventsel & ARCH_PERFMON_EVENTSEL_UMASK) >> 8; > int i; > > + /* > + * Fixed counters are always available if KVM reaches this point. If a > + * fixed counter is unsupported in hardware or guest CPUID, KVM doesn't > + * allow the counter's corresponding MSR to be written. KVM does use > + * architectural events to program fixed counters, as the interface to > + * perf doesn't allow requesting a specific fixed counter, e.g. perf > + * may (sadly) back a guest fixed PMC with a general purposed counter. > + * But if _hardware_ doesn't support the associated event, KVM simply > + * doesn't enumerate support for the fixed counter. > + */ > + if (pmc_is_fixed(pmc)) > + return true; > + > BUILD_BUG_ON(ARRAY_SIZE(intel_arch_events) != NR_INTEL_ARCH_EVENTS); > > /* > * Disallow events reported as unavailable in guest CPUID. Note, this > - * doesn't apply to pseudo-architectural events. > + * doesn't apply to pseudo-architectural events (see above). > */ > for (i = 0; i < NR_REAL_INTEL_ARCH_EVENTS; i++) { > if (intel_arch_events[i].eventsel != event_select || > -- > 2.42.0.869.gea05f2083d-goog >