Re: [PATCH v3 2/2] KVM: selftests: Print summary stats of memory latency distribution

On Wed, May 31, 2023 at 02:01:52PM -0700, Sean Christopherson wrote:
> On Mon, Mar 27, 2023, Colton Lewis wrote:
> > diff --git a/tools/testing/selftests/kvm/include/aarch64/processor.h b/tools/testing/selftests/kvm/include/aarch64/processor.h
> > index f65e491763e0..d441f485e9c6 100644
> > --- a/tools/testing/selftests/kvm/include/aarch64/processor.h
> > +++ b/tools/testing/selftests/kvm/include/aarch64/processor.h
> > @@ -219,4 +219,14 @@ uint32_t guest_get_vcpuid(void);
> >  uint64_t cycles_read(void);
> >  uint64_t cycles_to_ns(struct kvm_vcpu *vcpu, uint64_t cycles);
> > 
> > +#define MEASURE_CYCLES(x)			\
> > +	({					\
> > +		uint64_t start;			\
> > +		start = cycles_read();		\
> > +		isb();				\
> 
> Would it make sense to put the necessary barriers inside the cycles_read() (or
> whatever we end up calling it)?  Or does that not make sense on ARM?

+1. Additionally, the function should have a name that implies ordering,
like read_system_counter_ordered() or similar.
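
Something like the below for the x86 side, loosely following the
kernel's rdtsc_ordered() (untested sketch, the name is just my
suggestion above):

  static inline uint64_t read_system_counter_ordered(void)
  {
  	uint32_t eax, edx;

  	/*
  	 * The LFENCE keeps the RDTSC from being executed ahead of
  	 * earlier loads, so callers don't need their own barriers.
  	 */
  	asm volatile("lfence; rdtsc" : "=a"(eax), "=d"(edx) : : "memory");
  	return ((uint64_t)edx << 32) | eax;
  }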

> > +		x;				\
> > +		dsb(nsh);			\

I assume you're doing this because you want the loads and stores issued
by 'x' to complete before taking the second timestamp, right?

My knee-jerk reaction was that you could just do an mb() and share the
implementation between arches, but it would seem the tools/ flavor of
the barrier demotes to a DMB because... reasons.
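
For reference (going from memory here, so worth double-checking the
headers), the kernel proper has:

  /* arch/arm64/include/asm/barrier.h */
  #define mb()	dsb(sy)

whereas the tools/ copy reads something like:

  /* tools/arch/arm64/include/asm/barrier.h */
  #define mb()	asm volatile("dmb ish" ::: "memory")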

> > +		cycles_read() - start;		\
> > +	})
> > +
> >  #endif /* SELFTEST_KVM_PROCESSOR_H */
> ...
> 
> > diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
> > index 5d977f95d5f5..7352e02db4ee 100644
> > --- a/tools/testing/selftests/kvm/include/x86_64/processor.h
> > +++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
> > @@ -1137,4 +1137,14 @@ void virt_map_level(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
> >  uint64_t cycles_read(void);
> >  uint64_t cycles_to_ns(struct kvm_vcpu *vcpu, uint64_t cycles);
> > 
> > +#define MEASURE_CYCLES(x)			\
> > +	({					\
> > +		uint64_t start;			\
> > +		start = cycles_read();		\
> > +		asm volatile("mfence");		\
> 
> This is incorrect as placing the barrier after the RDTSC allows the RDTSC to be
> executed before earlier loads, e.g. could measure memory accesses from whatever
> was before MEASURE_CYCLES().  And per the kernel's rdtsc_ordered(), it sounds like
> RDTSC can only be hoisted before prior loads, i.e. will be ordered with respect
> to future loads and stores.

Same thing goes for the arm64 variant of the function... You want to
insert an isb() immediately _before_ you read the counter register so
the read can't be speculated ahead of earlier instructions.

arch_timer_read_cntvct_el0() back over in the kernel is a good example of
this. You can very likely ignore the ECV alternative for now.
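
IOW, something along these lines on the arm64 side (untested sketch,
relying on the isb() and read_sysreg() helpers the selftests already
have):

  static inline uint64_t read_system_counter_ordered(void)
  {
  	/* Bar the counter read from speculating past older instructions */
  	isb();
  	return read_sysreg(cntvct_el0);
  }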

-- 
Thanks,
Oliver


