Re: [PATCH v2 1/2] KVM: selftests: Provide generic way to read system counter

Marc Zyngier <maz@xxxxxxxxxx> · Fri, 17 Mar 2023 17:09:39 +0000

On Thu, 16 Mar 2023 22:27:51 +0000,
Colton Lewis <coltonlewis@xxxxxxxxxx> wrote:
> 
> Provide a generic function to read the system counter from the guest
> for timing purposes. A common and important way to measure guest
> performance is to measure the amount of time different actions take in
> the guest. Provide also a mathematical conversion from cycles to
> nanoseconds and a macro for timing individual statements.
> 
> Substitute the previous custom implementation of a similar function in
> system_counter_offset_test with this new implementation.
> 
> Signed-off-by: Colton Lewis <coltonlewis@xxxxxxxxxx>
> ---
>  .../testing/selftests/kvm/include/kvm_util.h  | 15 ++++++++++
>  tools/testing/selftests/kvm/lib/kvm_util.c    | 30 +++++++++++++++++++
>  .../kvm/system_counter_offset_test.c          | 10 ++-----
>  3 files changed, 47 insertions(+), 8 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h
> index c9286811a4cb..8b478eabee4c 100644
> --- a/tools/testing/selftests/kvm/include/kvm_util.h
> +++ b/tools/testing/selftests/kvm/include/kvm_util.h
> @@ -10,4 +10,19 @@
>  #include "kvm_util_base.h"
>  #include "ucall_common.h"
> 
> +#if defined(__aarch64__) || defined(__x86_64__)
> +
> +uint64_t cycles_read(void);
> +double cycles_to_ns(struct kvm_vcpu *vcpu, double cycles);
> +
> +#define MEASURE_CYCLES(x)			\
> +	({					\
> +		uint64_t start;			\
> +		start = cycles_read();		\
> +		x;				\

You insert memory accesses inside a sequence that has no dependency
with it. On a weakly ordered memory system, there is absolutely no
reason why the memory access shouldn't be moved around. What do you
exactly measure in that case?

> +		cycles_read() - start;		\

I also question the usefulness of this exercise. You're comparing the
time it takes for a multi-GHz system to put a write in a store buffer
(assuming it didn't miss in the TLBs) vs a counter that gets updated
at a frequency of a few tens of MHz.

My guts feeling is that this results in a big fat zero most of the
time, but I'm happy to be explained otherwise.

> +	})
> +
> +#endif
> +
>  #endif /* SELFTEST_KVM_UTIL_H */
> diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> index 3ea24a5f4c43..780481a92efe 100644
> --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> @@ -2135,3 +2135,34 @@ void __attribute((constructor)) kvm_selftest_init(void)
> 
>  	kvm_selftest_arch_init();
>  }
> +
> +#if defined(__aarch64__)
> +
> +#include "arch_timer.h"
> +
> +uint64_t cycles_read(void)
> +{
> +	return timer_get_cntct(VIRTUAL);
> +}
> +
> +double cycles_to_ns(struct kvm_vcpu *vcpu, double cycles)
> +{
> +	return cycles * (1e9 / timer_get_cntfrq());

We already have all the required code to deal with ns conversions
using a multiplier and a shift, avoiding floating point like the
plague it is. Please reuse the kernel code for this, as you're quite
likely to only measure the time it takes for KVM to trap the FP
registers and perform a FP/SIMD switch...

	M.

-- 
Without deviation from the norm, progress is not possible.