On Fri, Oct 14, 2022, Like Xu wrote:
> For subject title, the "reprogram" bit is _only_ used to keep track of
> pmc->perf_event,
> not whether the counter is disabled.
>
> On 23/9/2022 8:13 am, Sean Christopherson wrote:
> > When reprogramming a counter, clear the counter's "reprogram pending" bit
> > if the counter is disabled (by the guest) or is disallowed (by the
> > userspace filter).  In both cases, there's no need to re-attempt
> > programming on the next coincident KVM_REQ_PMU as enabling the counter by
> > either method will trigger reprogramming.
>
> Perhaps we could move the check_pmu_event_filter() towards the top of the
> call stack.

Top of what call stack exactly?  reprogram_counter() has multiple callers, and
the filter check is already near the top of reprogram_counter().

> > @@ -245,7 +245,6 @@ static bool pmc_resume_counter(struct kvm_pmc *pmc)
> >  	perf_event_enable(pmc->perf_event);
> >  	pmc->is_paused = false;
> > -	clear_bit(pmc->idx, (unsigned long *)&pmc_to_pmu(pmc)->reprogram_pmi);
>
> This change is very suspicious.

In the current code, pmc_resume_counter() clears the bit iff it returns true.
With this patch, reprogram_counter() is guaranteed to clear the bit if
pmc_resume_counter() returns true.

	if (pmc->current_config == new_config &&
	    pmc_resume_counter(pmc))
		goto reprogram_complete;

	pmc_release_perf_event(pmc);

	pmc->current_config = new_config;

	/*
	 * If reprogramming fails, e.g. due to contention, leave the counter's
	 * regprogram bit set, i.e. opportunistically try again on the next PMU
	 * refresh.  Don't make a new request as doing so can stall the guest
	 * if reprogramming repeatedly fails.
	 */
	if (pmc_reprogram_counter(pmc, PERF_TYPE_RAW,
				  (eventsel & pmu->raw_event_mask),
				  !(eventsel & ARCH_PERFMON_EVENTSEL_USR),
				  !(eventsel & ARCH_PERFMON_EVENTSEL_OS),
				  eventsel & ARCH_PERFMON_EVENTSEL_INT))
		return;

reprogram_complete:
	clear_bit(pmc->idx, (unsigned long *)&pmc_to_pmu(pmc)->reprogram_pmi);
	pmc->prev_counter = 0;

> > @@ -324,16 +323,27 @@ void reprogram_counter(struct kvm_pmc *pmc)
> >  	}
> >
> >  	if (pmc->current_config == new_config && pmc_resume_counter(pmc))
> > -		return;
> > +		goto reprogram_complete;
> >
> >  	pmc_release_perf_event(pmc);
> >
> >  	pmc->current_config = new_config;
> > -	pmc_reprogram_counter(pmc, PERF_TYPE_RAW,
> > -			      (eventsel & pmu->raw_event_mask),
> > -			      !(eventsel & ARCH_PERFMON_EVENTSEL_USR),
> > -			      !(eventsel & ARCH_PERFMON_EVENTSEL_OS),
> > -			      eventsel & ARCH_PERFMON_EVENTSEL_INT);
> > +
> > +	/*
> > +	 * If reprogramming fails, e.g. due to contention, leave the counter's
> > +	 * regprogram bit set, i.e. opportunistically try again on the next PMU
>
> This is what we need, in the upstream case we need to keep trying reprogram
> to try to occupy the hardware.

Maybe in an ideal world, but in reality KVM can't guarantee that programming
will ever succeed.  Making a new KVM_REQ_PMU will prevent entering the guest,
i.e. will effectively hang the vCPU.  Breaking the vPMU isn't great, but
hanging the guest is worse.

> > +	 * refresh.  Don't make a new request as doing so can stall the guest
> > +	 * if reprogramming repeatedly fails.
>
> This does not happen, the guest still enters w/o perf_event backend support
> and the vPMU is broken until the next vm-exit.
>
> There is no need to endlessly call kvm_pmu_handle_event() when reprogram
> fails.

Yes, that's what the above comment is calling out, or at least trying to call
out.
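For anyone skimming the thread, the control flow being debated boils down to
"clear the pending bit on any successful outcome, leave it set on failure so
the next refresh retries".  A toy, userspace-compilable sketch of that pattern
(toy_pmc, toy_reprogram, and the bool fields are illustrative names, not KVM
APIs; the real code operates on pmu->reprogram_pmi with clear_bit()):

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the per-PMU reprogram_pmi bitmap entry for one counter. */
struct toy_pmc {
	int idx;
	unsigned long *reprogram_bitmap; /* bit idx == "reprogram pending" */
	bool can_resume;                 /* models pmc_resume_counter() succeeding */
	bool program_fails;              /* models pmc_reprogram_counter() failing */
};

static void toy_reprogram(struct toy_pmc *pmc)
{
	/* Resuming the existing perf_event counts as success. */
	if (pmc->can_resume)
		goto reprogram_complete;

	/*
	 * Programming a fresh event failed, e.g. due to contention: leave
	 * the pending bit set so the next PMU refresh retries, but do NOT
	 * loop or re-request, as that could stall the vCPU indefinitely.
	 */
	if (pmc->program_fails)
		return;

reprogram_complete:
	/* Success either way: the counter is no longer pending reprogram. */
	*pmc->reprogram_bitmap &= ~(1UL << pmc->idx);
}
```

The point Sean is making falls out of the structure: both success paths funnel
through the single label, so the bit is cleared exactly once per successful
(re)programming, while the failure path leaves it set for an opportunistic
retry instead of forcing a new request.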