Re: [PATCH v3 2/2] KVM: selftests: Print summary stats of memory latency distribution

Oliver Upton <oliver.upton@xxxxxxxxx> writes:

On Wed, May 31, 2023 at 02:01:52PM -0700, Sean Christopherson wrote:
On Mon, Mar 27, 2023, Colton Lewis wrote:
> diff --git a/tools/testing/selftests/kvm/include/aarch64/processor.h b/tools/testing/selftests/kvm/include/aarch64/processor.h
> index f65e491763e0..d441f485e9c6 100644
> --- a/tools/testing/selftests/kvm/include/aarch64/processor.h
> +++ b/tools/testing/selftests/kvm/include/aarch64/processor.h
> @@ -219,4 +219,14 @@ uint32_t guest_get_vcpuid(void);
>  uint64_t cycles_read(void);
>  uint64_t cycles_to_ns(struct kvm_vcpu *vcpu, uint64_t cycles);
>
> +#define MEASURE_CYCLES(x)			\
> +	({					\
> +		uint64_t start;			\
> +		start = cycles_read();		\
> +		isb();				\

Would it make sense to put the necessary barriers inside the cycles_read() (or
whatever we end up calling it)?  Or does that not make sense on ARM?

+1. Additionally, the function should have a name that implies ordering,
like read_system_counter_ordered() or similar.

cycles_read() is currently a wrapper for timer_get_cntct(), which has
an isb() at the beginning but not the end. I think it would make more
sense to add the barrier there if there is no objection.
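
For concreteness, roughly what I have in mind, as a sketch only (using
the read_system_counter_ordered() name suggested above; I'm assuming the
virtual counter and the read_sysreg() helper here, the real change would
go in the existing timer helper):

	static inline uint64_t read_system_counter_ordered(void)
	{
		uint64_t cnt;

		/* Keep the counter read from starting before earlier instructions. */
		isb();
		cnt = read_sysreg(cntvct_el0);
		/* Keep later instructions from starting before the counter is read. */
		isb();

		return cnt;
	}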

> +		x;				\
> +		dsb(nsh);			\

I assume you're doing this because you want to wait for outstanding
loads and stores to complete due to 'x', right?

Correct.

My knee-jerk reaction was that you could just do an mb() and share the
implementation between arches, but it would seem the tools/ flavor of
the barrier demotes to a DMB because... reasons.

Yep, and from what I read in the ARM manual, it has to be a DSB.

> +		cycles_read() - start;		\
> +	})
> +
>  #endif /* SELFTEST_KVM_PROCESSOR_H */
...

> diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
> index 5d977f95d5f5..7352e02db4ee 100644
> --- a/tools/testing/selftests/kvm/include/x86_64/processor.h
> +++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
> @@ -1137,4 +1137,14 @@ void virt_map_level(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
>  uint64_t cycles_read(void);
>  uint64_t cycles_to_ns(struct kvm_vcpu *vcpu, uint64_t cycles);
>
> +#define MEASURE_CYCLES(x)			\
> +	({					\
> +		uint64_t start;			\
> +		start = cycles_read();		\
> +		asm volatile("mfence");		\

This is incorrect: placing the barrier after the RDTSC allows the RDTSC
to execute before earlier loads, e.g. it could end up measuring memory
accesses from whatever came before MEASURE_CYCLES().  And per the
kernel's rdtsc_ordered(), it sounds like RDTSC can only be hoisted
before prior loads, i.e. it will be ordered with respect to later loads
and stores.

Interesting, so I will swap the fence and the cycles_read().
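
Something like this, then (sketch only; I'm assuming the tail of the
macro mirrors the arm64 version, and adding a memory clobber while I'm
at it):

	#define MEASURE_CYCLES(x)					\
		({							\
			uint64_t start;					\
			/* Finish earlier accesses before the start timestamp. */ \
			asm volatile("mfence" ::: "memory");		\
			start = cycles_read();				\
			x;						\
			/* Let x's accesses complete before the end timestamp. */ \
			asm volatile("mfence" ::: "memory");		\
			cycles_read() - start;				\
		})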


Same thing goes for the arm64 variant of the function... You want to
insert an isb() immediately _before_ you read the counter register to
avoid speculation.

That's taken care of.  See my earlier comment about timer_get_cntct().
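
With the trailing isb() folded into the counter read, the arm64 side
would then end up roughly as (sketch):

	#define MEASURE_CYCLES(x)					\
		({							\
			uint64_t start;					\
			/* cycles_read() now has an isb() on both sides. */ \
			start = cycles_read();				\
			x;						\
			/* Wait for x's loads and stores to complete. */ \
			dsb(nsh);					\
			cycles_read() - start;				\
		})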


