On Thu, 16 Mar 2023 22:27:51 +0000, Colton Lewis <coltonlewis@xxxxxxxxxx> wrote: > > Provide a generic function to read the system counter from the guest > for timing purposes. A common and important way to measure guest > performance is to measure the amount of time different actions take in > the guest. Provide also a mathematical conversion from cycles to > nanoseconds and a macro for timing individual statements. > > Substitute the previous custom implementation of a similar function in > system_counter_offset_test with this new implementation. > > Signed-off-by: Colton Lewis <coltonlewis@xxxxxxxxxx> > --- > .../testing/selftests/kvm/include/kvm_util.h | 15 ++++++++++ > tools/testing/selftests/kvm/lib/kvm_util.c | 30 +++++++++++++++++++ > .../kvm/system_counter_offset_test.c | 10 ++----- > 3 files changed, 47 insertions(+), 8 deletions(-) > > diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h > index c9286811a4cb..8b478eabee4c 100644 > --- a/tools/testing/selftests/kvm/include/kvm_util.h > +++ b/tools/testing/selftests/kvm/include/kvm_util.h > @@ -10,4 +10,19 @@ > #include "kvm_util_base.h" > #include "ucall_common.h" > > +#if defined(__aarch64__) || defined(__x86_64__) > + > +uint64_t cycles_read(void); > +double cycles_to_ns(struct kvm_vcpu *vcpu, double cycles); > + > +#define MEASURE_CYCLES(x) \ > + ({ \ > + uint64_t start; \ > + start = cycles_read(); \ > + x; \ You insert memory accesses inside a sequence that has no dependency with it. On a weakly ordered memory system, there is absolutely no reason why the memory access shouldn't be moved around. What do you exactly measure in that case? > + cycles_read() - start; \ I also question the usefulness of this exercise. You're comparing the time it takes for a multi-GHz system to put a write in a store buffer (assuming it didn't miss in the TLBs) vs a counter that gets updated at a frequency of a few tens of MHz. My guts feeling is that this results in a big fat zero most of the time, but I'm happy to be explained otherwise. > + }) > + > +#endif > + > #endif /* SELFTEST_KVM_UTIL_H */ > diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c > index 3ea24a5f4c43..780481a92efe 100644 > --- a/tools/testing/selftests/kvm/lib/kvm_util.c > +++ b/tools/testing/selftests/kvm/lib/kvm_util.c > @@ -2135,3 +2135,34 @@ void __attribute((constructor)) kvm_selftest_init(void) > > kvm_selftest_arch_init(); > } > + > +#if defined(__aarch64__) > + > +#include "arch_timer.h" > + > +uint64_t cycles_read(void) > +{ > + return timer_get_cntct(VIRTUAL); > +} > + > +double cycles_to_ns(struct kvm_vcpu *vcpu, double cycles) > +{ > + return cycles * (1e9 / timer_get_cntfrq()); We already have all the required code to deal with ns conversions using a multiplier and a shift, avoiding floating point like the plague it is. Please reuse the kernel code for this, as you're quite likely to only measure the time it takes for KVM to trap the FP registers and perform a FP/SIMD switch... M. -- Without deviation from the norm, progress is not possible.