On Fri, Oct 14, 2022, Like Xu wrote:
> For subject title, the "reprogram" bit is _only_ used to keep track of
> pmc->perf_event,
> not whether the counter is disabled.
>
> On 23/9/2022 8:13 am, Sean Christopherson wrote:
> > When reprogramming a counter, clear the counter's "reprogram pending" bit
> > if the counter is disabled (by the guest) or is disallowed (by the
> > userspace filter).  In both cases, there's no need to re-attempt
> > programming on the next coincident KVM_REQ_PMU as enabling the counter by
> > either method will trigger reprogramming.
>
> Perhaps we could move the check_pmu_event_filter() towards the top of the
> call stack.

Top of what call stack exactly?  reprogram_counter() has multiple callers, and
the filter check is already near the top of reprogram_counter().

> > @@ -245,7 +245,6 @@ static bool pmc_resume_counter(struct kvm_pmc *pmc)
> >  	perf_event_enable(pmc->perf_event);
> >  	pmc->is_paused = false;
> > -	clear_bit(pmc->idx, (unsigned long *)&pmc_to_pmu(pmc)->reprogram_pmi);
>
> This change is very suspicious.

In the current code, pmc_resume_counter() clears the bit iff it returns true.
With this patch, reprogram_counter() is guaranteed to clear the bit if
pmc_resume_counter() returns true.

	if (pmc->current_config == new_config &&
	    pmc_resume_counter(pmc))
		goto reprogram_complete;

	pmc_release_perf_event(pmc);

	pmc->current_config = new_config;

	/*
	 * If reprogramming fails, e.g. due to contention, leave the counter's
	 * regprogram bit set, i.e. opportunistically try again on the next PMU
	 * refresh.  Don't make a new request as doing so can stall the guest
	 * if reprogramming repeatedly fails.
	 */
	if (pmc_reprogram_counter(pmc, PERF_TYPE_RAW,
				  (eventsel & pmu->raw_event_mask),
				  !(eventsel & ARCH_PERFMON_EVENTSEL_USR),
				  !(eventsel & ARCH_PERFMON_EVENTSEL_OS),
				  eventsel & ARCH_PERFMON_EVENTSEL_INT))
		return;

reprogram_complete:
	clear_bit(pmc->idx, (unsigned long *)&pmc_to_pmu(pmc)->reprogram_pmi);
	pmc->prev_counter = 0;

> > @@ -324,16 +323,27 @@ void reprogram_counter(struct kvm_pmc *pmc)
> >  	}
> >
> >  	if (pmc->current_config == new_config && pmc_resume_counter(pmc))
> > -		return;
> > +		goto reprogram_complete;
> >
> >  	pmc_release_perf_event(pmc);
> >
> >  	pmc->current_config = new_config;
> > -	pmc_reprogram_counter(pmc, PERF_TYPE_RAW,
> > -			      (eventsel & pmu->raw_event_mask),
> > -			      !(eventsel & ARCH_PERFMON_EVENTSEL_USR),
> > -			      !(eventsel & ARCH_PERFMON_EVENTSEL_OS),
> > -			      eventsel & ARCH_PERFMON_EVENTSEL_INT);
> > +
> > +	/*
> > +	 * If reprogramming fails, e.g. due to contention, leave the counter's
> > +	 * regprogram bit set, i.e. opportunistically try again on the next PMU
>
> This is what we need, in the upstream case we need to keep trying reprogram
> to try to occupy the hardware.

Maybe in an ideal world, but in reality KVM can't guarantee that programming
will ever succeed.  Making a new KVM_REQ_PMU will prevent entering the guest,
i.e. will effectively hang the vCPU.  Breaking the vPMU isn't great, but
hanging the guest is worse.

> > +	 * refresh.  Don't make a new request as doing so can stall the guest
> > +	 * if reprogramming repeatedly fails.
>
> This does not happen, the guest still enters w/o perf_event backend support
> and the vPMU is broken until the next vm-exit.
>
> There is no need to endlessly call kvm_pmu_handle_event() when reprogram
> fails.

Yes, that's what the above comment is calling out, or at least trying to call
out.
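For anyone skimming the thread, the control flow being debated boils down to
"clear the pending bit on any successful outcome, leave it set on failure so
the next refresh retries".  A toy, userspace-compilable sketch of that pattern
(toy_pmc, toy_reprogram, and the bool fields are illustrative names, not KVM
APIs; the real code operates on pmu->reprogram_pmi with clear_bit()):

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the per-PMU reprogram_pmi bitmap entry for one counter. */
struct toy_pmc {
	int idx;
	unsigned long *reprogram_bitmap; /* bit idx == "reprogram pending" */
	bool can_resume;                 /* models pmc_resume_counter() succeeding */
	bool program_fails;              /* models pmc_reprogram_counter() failing */
};

static void toy_reprogram(struct toy_pmc *pmc)
{
	/* Resuming the existing perf_event counts as success. */
	if (pmc->can_resume)
		goto reprogram_complete;

	/*
	 * Programming a fresh event failed, e.g. due to contention: leave
	 * the pending bit set so the next PMU refresh retries, but do NOT
	 * loop or re-request, as that could stall the vCPU indefinitely.
	 */
	if (pmc->program_fails)
		return;

reprogram_complete:
	/* Success either way: the counter is no longer pending reprogram. */
	*pmc->reprogram_bitmap &= ~(1UL << pmc->idx);
}
```

The point Sean is making falls out of the structure: both success paths funnel
through the single label, so the bit is cleared exactly once per successful
(re)programming, while the failure path leaves it set for an opportunistic
retry instead of forcing a new request.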