On Thu, Aug 01, 2024, Mingwei Zhang wrote: > From: Sandipan Das <sandipan.das@xxxxxxx> > > Currently, the global control bits for a vcpu are restored to the reset > state only if the guest PMU version is less than 2. This works for > emulated PMU as the MSRs are intercepted and backing events are created > for and managed by the host PMU [1]. > > If such a guest in run with passthrough PMU, the counters no longer work > because the global enable bits are cleared. Hence, set the global enable > bits to their reset state if passthrough PMU is used. > > A passthrough-capable host may not necessarily support PMU version 2 and > it can choose to restore or save the global control state from struct > kvm_pmu in the PMU context save and restore helpers depending on the > availability of the global control register. > > [1] 7b46b733bdb4 ("KVM: x86/pmu: Set enable bits for GP counters in PERF_GLOBAL_CTRL at "RESET""); > > Reported-by: Mingwei Zhang <mizhang@xxxxxxxxxx> > Signed-off-by: Sandipan Das <sandipan.das@xxxxxxx> > [removed the fixes tag] > Signed-off-by: Mingwei Zhang <mizhang@xxxxxxxxxx> > --- > arch/x86/kvm/pmu.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c > index 5768ea2935e9..e656f72fdace 100644 > --- a/arch/x86/kvm/pmu.c > +++ b/arch/x86/kvm/pmu.c > @@ -787,7 +787,7 @@ void kvm_pmu_refresh(struct kvm_vcpu *vcpu) > * in the global controls). Emulate that behavior when refreshing the > * PMU so that userspace doesn't need to manually set PERF_GLOBAL_CTRL. > */ > - if (kvm_pmu_has_perf_global_ctrl(pmu) && pmu->nr_arch_gp_counters) > + if ((pmu->passthrough || kvm_pmu_has_perf_global_ctrl(pmu)) && pmu->nr_arch_gp_counters) > pmu->global_ctrl = GENMASK_ULL(pmu->nr_arch_gp_counters - 1, 0); This is wrong and confusing. From the guest's perspective, and therefore from host userspace's perspective, PERF_GLOBAL_CTRL does not exist. Therefore, the value that is tracked for the guest must be '0'. I see that intel_passthrough_pmu_msrs() and amd_passthrough_pmu_msrs() intercept accesses to PERF_GLOBAL_CTRL if "pmu->version > 1" (which, by the by, needs to be kvm_pmu_has_perf_global_ctrl()), so there's no weirdness with the guest being able to access MSRs that shouldn't exist. But KVM shouldn't stuff pmu->global_ctrl, and doing so is a symptom of another flaw. Unless I'm missing something, KVM stuffs pmu->global_ctrl so that the correct value is loaded on VM-Enter, but loading and saving PERF_GLOBAL_CTRL on entry/exit is unnecessary and confusing, as is loading the associated MSRs when restoring (loading) the guest context. For PERF_GLOBAL_CTRL on Intel, KVM needs to ensure all GP counters are enabled in VMCS.GUEST_IA32_PERF_GLOBAL_CTRL, but that's a "set once and forget" operation, not something that needs to be done on every entry and exit. Of course, loading and saving PERF_GLOBAL_CTRL on every entry/exit is unnecessary for other reasons, but that's largely orthogonal. On AMD, amd_restore_pmu_context()[*] needs to enable a maximal value for PERF_GLOBAL_CTRL, but I don't think there's any need to load the other MSRs, and the maximal value should come from the above logic, not pmu->global_ctrl. [*] Side topic, in case I forget later, that API should be "load", not "restore". There is no assumption or guarantee that KVM is exactly restoring anything, e.g. if PERF_GLOBAL_CTRL doesn't exist in the guest PMU and on the first load.