Hey Sean,
Thanks for the review.
Sean Christopherson <seanjc@xxxxxxxxxx> writes:
On Wed, Sep 18, 2024, Colton Lewis wrote:
Branch in main() depending on if the CPU is Intel or AMD. They are
subject to vastly different requirements because the AMD PMU lacks
many properties defined by the Intel PMU including the entire CPUID
0xa function where Intel stores all the PMU properties. AMD lacks this
as well as any consistent notion of PMU versions as Intel does. Every
feature is a separate flag and they aren't the same features as Intel.
Set up a VM for testing core AMD counters and ensure proper CPUID
features are set.
Signed-off-by: Colton Lewis <coltonlewis@xxxxxxxxxx>
---
.../selftests/kvm/x86_64/pmu_counters_test.c | 104 ++++++++++++++----
1 file changed, 83 insertions(+), 21 deletions(-)
diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
index 0e305e43a93b..5b240585edc5 100644
--- a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
+++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
@@ -30,10 +30,21 @@
#define NUM_INSNS_RETIRED (NUM_LOOPS * NUM_INSNS_PER_LOOP +
NUM_EXTRA_INSNS)
+/*
+ * Limit testing to MSRs that are actually defined by Intel (in the
SDM). MSRs
+ * that aren't defined counter MSRs *probably* don't exist, but there's
no
+ * guarantee that currently undefined MSR indices won't be used for
something
+ * other than PMCs in the future.
+ */
+#define MAX_NR_GP_COUNTERS 8
+#define MAX_NR_FIXED_COUNTERS 3
+#define AMD_NR_CORE_COUNTERS 4
+#define AMD_NR_CORE_EXT_COUNTERS 6
+
static uint8_t kvm_pmu_version;
static bool kvm_has_perf_caps;
-static struct kvm_vm *pmu_vm_create_with_one_vcpu(struct kvm_vcpu
**vcpu,
+static struct kvm_vm *intel_pmu_vm_create(struct kvm_vcpu **vcpu,
void *guest_code,
When renaming things, please fixup the alignment as needed. Yes, it's
more
churn, but one-time pain is preferable to living indefinitely with funky
formatting.
Understood
I also don't like renaming just one symbol. E.g. the above
MAX_NR_GP_COUNTERS
and MAX_NR_FIXED_COUNTERS #defines are Intel specific, but that's not at
all
clear from the code. Ditto for guest_rd_wr_counters() vs.
guest_test_rdwr_core_counters().
Given how little code is actually shared between Intel and AMD, I think
it makes
sense to have the bulk of the code live in separate .c files. Since
tools/testing/selftests/kvm/lib/x86/pmu.c is already a thing, the best
option is
probably to rename pmu_counters_test.c to intel_pmu_counters_test.c, and
then
extract the common bits to lib/x86/pmu.c (or include/x86/pmu.h as
appropriate).
Yeah I see what you mean about making processor-specific functions more
obvious and think separating to different files would help a lot with
that.
uint8_t pmu_version,
uint64_t perf_capabilities)
+static void test_core_counters(void)
+{
+ uint8_t nr_counters = nr_core_counters();
+ bool core_ext = kvm_cpu_has(X86_FEATURE_PERF_CTR_EXT_CORE);
+ bool perfmon_v2 = kvm_cpu_has(X86_FEATURE_PERFMON_V2);
+ struct kvm_vcpu *vcpu;
+ struct kvm_vm *vm;
+
+ for (uint8_t ce = 0; ce <= core_ext; ce++) {
Kernel style is to not declared variables inside for-loops.
I ran it through checkpatch and it didn't complain.
+ for (uint8_t pm = 0; pm <= perfmon_v2; pm++) {
Iterating over booleans is decidedly odd, the indentation levels are
painful and
will only get worse as more features are added, and the "ce" and "pm"
variables
aren't all that intuitive. More below.
+ for (uint8_t nc = 0; nc <= nr_counters; nc++) {
I also find "nc" to be unintuitive. Either use a fully descriptive name,
or
make it obvious that the variables is an iterator. E.g. either
uint8_t max_nr_counters = nr_core_counters();
...
for (nr_counters = 0; nr_counters < max_nr_counters; nr_counters++) {
or
for (j = 0; j < nr_counters; j++) {
'j' is obviously not descriptive, but when reading the usage, it's more
obvious
that it's a loop iterator (if you choose
Ok I'll go with the generic loop iterator name.
+ vm = vm_create_with_one_vcpu(&vcpu, guest_test_core_counters);
+
+ if (nc)
Is '0' not a legal number of counters?
AMD64 Architecture Programmers Manual Volume 2 Chapter 13.2.2 states
that only nonzero values for that property are meaningful.
With 0, you are supposed to assume either 4 or 6 counters depending on
the CoreExt feature.
I could make 0 mean 0 counters inside the guest but that would break
what hardware does.
+ vcpu_set_cpuid_property(
Google3! (Never, ever wrap immediately after the opening paranethesis).
Checkpatch didn't complain.
+ vcpu, X86_PROPERTY_NUM_PERF_CTR_CORE, nc);
+ if (ce)
+ vcpu_set_cpuid_feature(
+ vcpu, X86_FEATURE_PERF_CTR_EXT_CORE);
This likely doesn't do what you want. By default, vm_arch_vcpu_add()
initializes
CPUID to KVM's supported CPUID. So only _setting_ the feature means that
that
the test is likely only ever running with the full set of supported
features.
Ok, so I have to explicitly unset the feature if I don't want it.
Jumping back to my complaints with the for-loops, if the features to
interate on
are collected in an array, then the test can generate a mask of all
possible
combinations and iterate over that (plus the array itself). That keeps
the
indentation bounded and eliminates the copy+paste needed to add a new
feature.
The only downside is that the test is limited to 64 features, but we'll
run into
run time issues long before that limit is reached.
const struct kvm_x86_cpu_feature pmu_features[] = {
X86_FEATURE_PERF_CTR_EXT_CORE,
X86_FEATURE_PERFMON_V2,
};
const u64 pmu_features_mask = BIT_ULL(ARRAY_SIZE(pmu_features)) - 1;
for (mask = 0; mask <= pmu_features_mask; mask++) {
for (nr_counters = 0; nr_counters < max_nr_counters; nr_counters++) {
vm = vm_create_with_one_vcpu(&vcpu, guest_test_core_counters);
vcpu_set_cpuid_property(vcpu, X86_PROPERTY_NUM_PERF_CTR_CORE,
nr_counters);
/* Comment goes here */
for (i = 0; i < ARRAY_SIZE(pmu_features); i++)
vcpu_set_or_clear_cpuid_feature(vcpu, pmu_features[i],
mask & BIT_ULL(i));
...
}
I thought of putting the features in a bitmask but worried it obscured
the intent too much. Listing features in an array and constructing the
bitmask seems clear enough to me so I will use that technique now.
+ if (pm)
+ vcpu_set_cpuid_feature(
+ vcpu, X86_FEATURE_PERFMON_V2);
+
+ pr_info("Testing core counters: CoreExt = %u, PerfMonV2 = %u,
NumCounters = %u\n",
+ ce, pm, nc);
+ run_vcpu(vcpu);
+
+ kvm_vm_free(vm);
+ }
+ }
+ }
+}