Re: [PATCH v2 1/2] KVM: selftests: Provide generic way to read system counter

Colton Lewis <coltonlewis@xxxxxxxxxx> · Tue, 21 Mar 2023 19:10:04 +0000

Marc Zyngier <maz@xxxxxxxxxx> writes:

+#define MEASURE_CYCLES(x)			\
+	({					\
+		uint64_t start;			\
+		start = cycles_read();		\
+		x;				\

You insert memory accesses inside a sequence that has no dependency
on them. On a weakly ordered memory system, there is absolutely no
reason why the memory access shouldn't be moved around. What exactly
do you measure in that case?

cycles_read is built on another function, timer_get_cntct, which
includes its own barriers. Stripped of some abstraction, the sequence is:

timer_get_cntct (isb+read timer)
whatever is being measured
timer_get_cntct

I hadn't looked at it too closely before, but on review of the manual
I think you are correct. Borrowing from example D7-2 in the manual, the
sequence should be:

timer_get_cntct
isb
whatever is being measured
dsb
timer_get_cntct

+		cycles_read() - start;		\

I also question the usefulness of this exercise. You're comparing the
time it takes for a multi-GHz system to put a write in a store buffer
(assuming it didn't miss in the TLBs) vs a counter that gets updated
at a frequency of a few tens of MHz.

My gut feeling is that this results in a big fat zero most of the
time, but I'm happy to be explained otherwise.

In context, I'm trying to measure the time it takes to write to a buffer
*with dirty memory logging enabled*. What do you mean by zero? I can
confirm from running this code that I am not measuring zero time.

We already have all the required code to deal with ns conversions
using a multiplier and a shift, avoiding floating point like the
plague it is. Please reuse the kernel code for this, as you're quite
likely to only measure the time it takes for KVM to trap the FP
registers and perform a FP/SIMD switch...

Will do.
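
For reference, a minimal sketch of the multiplier-and-shift idea, similar
in shape to the kernel's clocksource_cyc2ns(). The helper names
calc_mult() and cycles_to_ns() are hypothetical, and the 25 MHz
frequency and shift of 24 in the usage note are example values only.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Hypothetical helper: pick mult for a given shift so that
 * ns ~= (cycles * mult) >> shift. The caller must choose shift small
 * enough that NSEC_PER_SEC << shift fits in 64 bits and the result
 * fits in 32 bits.
 */
static inline uint32_t calc_mult(uint64_t freq_hz, unsigned int shift)
{
	return (uint32_t)((NSEC_PER_SEC << shift) / freq_hz);
}

/*
 * Integer-only conversion from counter ticks to nanoseconds.
 * cycles * mult can overflow for very long intervals; this sketch
 * does not guard against that.
 */
static inline uint64_t cycles_to_ns(uint64_t cycles, uint32_t mult,
				    unsigned int shift)
{
	return (cycles * mult) >> shift;
}
```

With a 25 MHz counter and shift = 24, calc_mult() yields 40 << 24, so
each tick converts to 40 ns without touching the FP registers.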