On Fri, Apr 07, 2023, Like Xu wrote:
> On 7/4/2023 9:35 am, Sean Christopherson wrote:
> > On Tue, Feb 14, 2023, Like Xu wrote:
> > > +    case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS:
> > > +        if (!msr_info->host_initiated)
> > > +            return 0; /* Writes are ignored */
> >
> > Where is the "writes ignored" behavior documented?  I can't find anything in the
> > APM that defines write behavior.
>
> KVM would follow the real hardware behavior once specifications stay silent
> on details or secret.

So is that a "this isn't actually documented anywhere" answer?  It's not your
responsibility to get AMD to document their CPUs, but I want to clearly document
when KVM's behavior is based solely off of observed hardware behavior, versus an
actual specification.

> How about this:
>
>     /*
>      * Note, AMD ignores writes to reserved bits and read-only PMU MSRs,
>      * whereas Intel generates #GP on attempts to write reserved/RO MSRs.
>      */

Looks good.

> > > +        pmu->nr_arch_gp_counters = min_t(unsigned int,
> > > +                                         ebx.split.num_core_pmc,
> > > +                                         kvm_pmu_cap.num_counters_gp);
> > > +    } else if (guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE)) {
> > >          pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS_CORE;
> >
> > This needs to be sanitized, no?  E.g. if KVM only has access to 4 counters, but
> > userspace sets X86_FEATURE_PERFCTR_CORE anyways.  Hrm, unless I'm missing something,
> > that's a pre-existing bug.
>
> Now your point is that if a user space more capbility than KVM can support,
> KVM should constrain it.
> Your previous preference was that the user space can set capbilities that
> evene if KVM doesn't support as long as it doesn't break KVM and host and the
> guest will eat its own.

Letting userspace define a "bad" configuration is perfectly ok, but KVM needs to
be careful not to endanger itself by consuming the bad state.  A good example is
the handling of nested SVM features in svm_vcpu_after_set_cpuid().  KVM lets
userspace define anything and everything, but KVM only actually tries to utilize
a feature if the feature is actually supported in hardware.

In this case, it's not clear to me that putting a bogus value into
"nr_arch_gp_counters" is safe (for KVM).  And AIUI, the guest can't actually use
more than kvm_pmu_cap.num_counters_gp counters, i.e. KVM isn't arbitrarily
restricting the setup.
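
To make the sanitization concrete, here is a rough standalone sketch of the
idea, not a proposed patch.  Every name prefixed with "sketch_" is hypothetical,
and the counter counts (6 for PERFCTR_CORE, 4 for legacy) are assumptions
standing in for the kernel's constants.  The point is only that whatever count
userspace's CPUID implies gets clamped to what KVM can actually back, in every
branch and not just the first one:

```c
#include <stdbool.h>

/* Assumed stand-ins for the kernel's counter-count constants. */
#define SKETCH_NUM_COUNTERS_CORE   6u
#define SKETCH_NUM_COUNTERS_LEGACY 4u

/* Hypothetical model of what the host/KVM can actually provide. */
struct sketch_pmu_cap {
	unsigned int num_counters_gp;
};

/* Hypothetical model of the guest CPUID state set by userspace. */
struct sketch_guest_cpuid {
	bool reports_core_pmc_count;  /* CPUID enumerates a counter count */
	unsigned int num_core_pmc;    /* valid only if the above is true */
	bool has_perfctr_core;        /* X86_FEATURE_PERFCTR_CORE is set */
};

static unsigned int sketch_min(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Userspace may describe any configuration it likes; the value KVM
 * consumes is always bounded by the supported counter count.
 */
static unsigned int sketch_nr_gp_counters(const struct sketch_guest_cpuid *cpuid,
					  const struct sketch_pmu_cap *cap)
{
	unsigned int wanted;

	if (cpuid->reports_core_pmc_count)
		wanted = cpuid->num_core_pmc;
	else if (cpuid->has_perfctr_core)
		wanted = SKETCH_NUM_COUNTERS_CORE;
	else
		wanted = SKETCH_NUM_COUNTERS_LEGACY;

	/* Clamp once, after the branches, so no path can skip it. */
	return sketch_min(wanted, cap->num_counters_gp);
}
```

With a host that exposes only 4 counters, a guest whose CPUID sets
PERFCTR_CORE would end up with 4 counters in this model rather than 6, which
matches the "guest can't use more than kvm_pmu_cap.num_counters_gp anyway"
reasoning above.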