On 11/01/2011 02:30 PM, Gleb Natapov wrote:
> > > +
> > > +/* mapping between fixed pmc index and arch_events array */
> > > +int fixed_pmc_events[] = {1, 0, 2};
> > > +
> > > +static bool pmc_is_gp(struct kvm_pmc *pmc)
> > > +{
> > > +	return pmc->type == KVM_PMC_GP;
> > > +}
> > > +
> > > +static inline u64 pmc_bitmask(struct kvm_pmc *pmc)
> > > +{
> > > +	struct kvm_pmu *pmu = &pmc->vcpu->arch.pmu;
> > > +
> > > +	return pmc_is_gp(pmc) ? pmu->gp_counter_bitmask :
> > > +		pmu->fixed_counter_bitmask;
> > > +}
> >
> > Nicer to just push the bitmask (or bitwidth) into the counter itself.
> >
> Hmm, is it really nicer to replicate the same information 35 times?

If it were 35 times, you could do pmu->type->bitmask.  But it's just 5
or 6 times.

> > > +
> > > +static void kvm_perf_overflow_intr(struct perf_event *perf_event,
> > > +		struct perf_sample_data *data, struct pt_regs *regs)
> > > +{
> > > +	struct kvm_pmc *pmc = perf_event->overflow_handler_context;
> > > +	struct kvm_pmu *pmu = &pmc->vcpu->arch.pmu;
> > > +	if (!__test_and_set_bit(pmc_to_global_idx(pmc),
> > > +			(unsigned long *)&pmu->reprogram_pmi)) {
> > > +		kvm_perf_overflow(perf_event, data, regs);
> > > +		kvm_make_request(KVM_REQ_PMU, pmc->vcpu);
> > > +	}
> > > +}
> >
> > Is it safe to use the __ versions here?
> >
> It is supposed to run in NMI context on the same CPU that just ran
> the vcpu, so simultaneous access to the same variable from different
> CPUs shouldn't be possible. But if your scenario below can happen,
> then that assumption may not hold. The question is whether PMI
> delivery can be so skewed as to be delivered long after vmexit (which
> switches the perf MSR values, btw).

The compiler/runtime is allowed to implement __test_and_set_bit() as
multiple instructions, no?  Do we have any similar sequences outside
NMI context?

> > Do we need to follow kvm_make_request() with kvm_vcpu_kick()?  If
> > there is a skew between the overflow and the host PMI, the guest
> > might have executed a HLT.
> Is kvm_vcpu_kick() safe for NMI context?

No.  There is irq_work_queue() for that (rough sketch below).  It
would be good to avoid it when we know that it's safe to (for example,
if we have PF_VCPU set).

> > > +
> > > +static void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 en_pmi, int idx)
> > > +{
> > > +	unsigned en = en_pmi & 0x3;
> > > +	bool pmi = en_pmi & 0x8;
> > > +
> > > +	stop_counter(pmc);
> > > +
> > > +	if (!en || !pmc_enabled(pmc))
> > > +		return;
> > > +
> > > +	reprogram_counter(pmc, PERF_TYPE_HARDWARE,
> > > +			arch_events[fixed_pmc_events[idx]].event_type,
> > > +			!(en & 0x2), /* exclude user */
> > > +			!(en & 0x1), /* exclude kernel */
> > > +			pmi);
> >
> > Are there no #defines for those constants?
> >
> Nope. perf_event_intel.c open codes them too.

Okay.

> > The user can cause this to be very small (even zero).  Can this
> > cause an NMI storm?
> >
> If the user sets it to zero, then attr.sample_period will always be 0
> and perf will treat the event as non-sampling and use max_period
> instead. For a small value greater than zero, how is it different
> from userspace creating an event with a sample_period of 1?

I don't know.  Does the kernel survive it?
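For concreteness, a rough sketch of the irq_work approach mentioned
above (untested; the irq_work member in struct kvm_pmu and the
kvm_pmi_trigger_fn() name are made up for illustration, not from this
patch):

#include <linux/irq_work.h>

/* Assumed setup somewhere in pmu init:
 *	init_irq_work(&pmu->irq_work, kvm_pmi_trigger_fn);
 */

/* Runs later in interrupt context, where kvm_vcpu_kick() is allowed. */
static void kvm_pmi_trigger_fn(struct irq_work *irq_work)
{
	struct kvm_pmu *pmu = container_of(irq_work, struct kvm_pmu,
					   irq_work);
	struct kvm_vcpu *vcpu = container_of(pmu, struct kvm_vcpu,
					     arch.pmu);

	kvm_vcpu_kick(vcpu);
}

static void kvm_perf_overflow_intr(struct perf_event *perf_event,
		struct perf_sample_data *data, struct pt_regs *regs)
{
	struct kvm_pmc *pmc = perf_event->overflow_handler_context;
	struct kvm_pmu *pmu = &pmc->vcpu->arch.pmu;

	/* atomic variant, in case the PMI is skewed past vmexit */
	if (!test_and_set_bit(pmc_to_global_idx(pmc),
			(unsigned long *)&pmu->reprogram_pmi)) {
		kvm_perf_overflow(perf_event, data, regs);
		kvm_make_request(KVM_REQ_PMU, pmc->vcpu);
		/*
		 * irq_work_queue() is NMI-safe; the actual kick runs
		 * once we are back in a sane context.
		 */
		irq_work_queue(&pmu->irq_work);
	}
}

-- 
error compiling committee.c: too many arguments to function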