Re: [PATCH v4 11/12] KVM: x86/svm/pmu: Add AMD PerfMonV2 support

Sean Christopherson <seanjc@xxxxxxxxxx> · Fri, 7 Apr 2023 07:44:57 -0700

On Fri, Apr 07, 2023, Like Xu wrote:
> On 7/4/2023 9:35 am, Sean Christopherson wrote:
> > On Tue, Feb 14, 2023, Like Xu wrote:
> > > +	case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS:
> > > +		if (!msr_info->host_initiated)
> > > +			return 0; /* Writes are ignored */
> > 
> > Where is the "writes ignored" behavior documented?  I can't find anything in the
> > APM that defines write behavior.
> 
> KVM would follow the real hardware behavior once specifications stay silent
> on details or secret.

So is that a "this isn't actually documented anywhere" answer?  It's not your
responsibility to get AMD to document their CPUs, but I want to clearly document
when KVM's behavior is based solely off of observed hardware behavior, versus an
actual specification.

> How about this:
> 
> 	/*
> 	 * Note, AMD ignores writes to reserved bits and read-only PMU MSRs,
> 	 * whereas Intel generates #GP on attempts to write reserved/RO MSRs.
> 	 */

Looks good.

> > > +		pmu->nr_arch_gp_counters = min_t(unsigned int,
> > > +						 ebx.split.num_core_pmc,
> > > +						 kvm_pmu_cap.num_counters_gp);
> > > +	} else if (guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE)) {
> > >   		pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS_CORE;
> > 
> > This needs to be sanitized, no?  E.g. if KVM only has access to 4 counters, but
> > userspace sets X86_FEATURE_PERFCTR_CORE anyways.  Hrm, unless I'm missing something,
> > that's a pre-existing bug.
> 
> Now your point is that if a user space more capbility than KVM can support,
> KVM should constrain it.
> Your previous preference was that the user space can set capbilities that
> evene if KVM doesn't support as long as it doesn't break KVM and host and the
> guest will eat its own.

Letting userspace define a "bad" configuration is perfectly ok, but KVM needs to
be careful not to endanger itself by consuming the bad state.  A good example is
the handling of nested SVM features in svm_vcpu_after_set_cpuid().  KVM lets
userspace define anything and everything, but KVM only actually tries to utilize
a feature if the feature is actually supported in hardware.

In this case, it's not clear to me that putting a bogus value into "nr_arch_gp_counters"
is safe (for KVM).  And AIUI, the guest can't actually use more than
kvm_pmu_cap.num_counters_gp counters, i.e. KVM isn't arbitrarily restricting the
setup.