> On 26 Nov 2019, at 0:44, Aaron Lewis <aaronlewis@xxxxxxxxxx> wrote:
>
> Verify that the difference between an L2 RDTSC instruction and the
> IA32_TIME_STAMP_COUNTER MSR value stored in the VMCS12's VM-exit
> MSR-store list is less than 750 cycles, 99.9% of the time.
>
> Signed-off-by: Aaron Lewis <aaronlewis@xxxxxxxxxx>
> Reviewed-by: Jim Mattson <jmattson@xxxxxxxxxx>
> ---
>  x86/unittests.cfg |  6 ++++
>  x86/vmx_tests.c   | 89 +++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 95 insertions(+)
>
> diff --git a/x86/unittests.cfg b/x86/unittests.cfg
> index b4865ac..5291d96 100644
> --- a/x86/unittests.cfg
> +++ b/x86/unittests.cfg
> @@ -284,6 +284,12 @@ extra_params = -cpu host,+vmx -append vmx_vmcs_shadow_test
>  arch = x86_64
>  groups = vmx
>
> +[vmx_rdtsc_vmexit_diff_test]
> +file = vmx.flat
> +extra_params = -cpu host,+vmx -append rdtsc_vmexit_diff_test
> +arch = x86_64
> +groups = vmx
> +
>  [debug]
>  file = debug.flat
>  arch = x86_64
> diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
> index 1d8932f..f42ae2c 100644
> --- a/x86/vmx_tests.c
> +++ b/x86/vmx_tests.c
> @@ -8790,7 +8790,94 @@ static void vmx_vmcs_shadow_test(void)
>  	enter_guest();
>  }
>
> +/*
> + * This test monitors the difference between an L2 RDTSC instruction
> + * and the IA32_TIME_STAMP_COUNTER MSR value stored in the VMCS12
> + * VM-exit MSR-store list when taking a VM-exit on the instruction
> + * following RDTSC.
> + */
> +#define RDTSC_DIFF_ITERS 100000
> +#define RDTSC_DIFF_FAILS 100
> +#define L1_RDTSC_LIMIT 750

General note: I personally dislike the use of the terms L1 and L2 in
kvm-unit-tests. I prefer host vs. guest, or VMX root mode vs. non-root
mode, especially considering that kvm-unit-tests has de facto become
cpu-unit-tests: it can run on top of any CPU implementation, either a
vCPU on top of some hypervisor (KVM being one of them) or a bare-metal
CPU (as Nadav Amit does to verify test correctness :P).

> +
> +/*
> + * Set 'use TSC offsetting' and set the L2 offset to the
> + * inverse of L1's current TSC value, so that L2 starts running
> + * with an effective TSC value of 0.
> + */
> +static void reset_l2_tsc_to_zero(void)
> +{
> +	TEST_ASSERT_MSG(ctrl_cpu_rev[0].clr & CPU_USE_TSC_OFFSET,
> +			"Expected support for 'use TSC offsetting'");
> +
> +	vmcs_set_bits(CPU_EXEC_CTRL0, CPU_USE_TSC_OFFSET);
> +	vmcs_write(TSC_OFFSET, -rdtsc());
> +}
> +
> +static void rdtsc_vmexit_diff_test_guest(void)
> +{
> +	int i;
> +
> +	for (i = 0; i < RDTSC_DIFF_ITERS; i++)
> +		asm volatile("rdtsc; vmcall" : : : "eax", "edx");

I would add a comment here on why you use inline asm instead of just
{ l2_rdtsc = rdtsc(); vmcall(); } (because of the extra cycles wasted
on ORing RDX:RAX together and saving the result to some global before
the vmcall).

> +}
> +
> +/*
> + * This function only considers the "use TSC offsetting" VM-execution
> + * control.  It does not handle "use TSC scaling" (because the latter
> + * isn't available to L1 today.)

Because the function's correctness relies on "use TSC scaling" not
being in use, consider adding a runtime assert() for it?

> + */
> +static unsigned long long l1_time_to_l2_time(unsigned long long t)
> +{
> +	if (vmcs_read(CPU_EXEC_CTRL0) & CPU_USE_TSC_OFFSET)
> +		t += vmcs_read(TSC_OFFSET);
> +
> +	return t;
> +}
> +
> +static unsigned long long get_tsc_diff(void)

I think get_tsc_diff() is a bit too generic a name and may cause
confusion. I would consider renaming it to
rdtsc_vmexit_diff_test_iteration(), or just putting the logic inline
in the test itself.

> +{
> +	unsigned long long l2_tsc, l1_to_l2_tsc;
> +
> +	enter_guest();
> +	skip_exit_vmcall();
> +	l2_tsc = (u32) regs.rax + (regs.rdx << 32);
> +	l1_to_l2_tsc = l1_time_to_l2_time(exit_msr_store[0].value);
> +
> +	return l1_to_l2_tsc - l2_tsc;
> +}
> +
> +static void rdtsc_vmexit_diff_test(void)
> +{
> +	int fail = 0;
> +	int i;
> +
> +	test_set_guest(rdtsc_vmexit_diff_test_guest);
> +
> +	reset_l2_tsc_to_zero();
>
> +	/*
> +	 * Set up the VMCS12 VM-exit MSR-store list to store just one
> +	 * MSR: IA32_TIME_STAMP_COUNTER. Note that the value stored is
> +	 * in the L1 time domain (i.e., it is not adjusted according
> +	 * to the TSC multiplier and TSC offset fields in the VMCS12,
> +	 * as an L2 RDTSC would be.)
> +	 */
> +	exit_msr_store = alloc_page();
> +	exit_msr_store[0].index = MSR_IA32_TSC;
> +	vmcs_write(EXI_MSR_ST_CNT, 1);
> +	vmcs_write(EXIT_MSR_ST_ADDR, virt_to_phys(exit_msr_store));
> +
> +	for (i = 0; i < RDTSC_DIFF_ITERS; i++) {
> +		if (get_tsc_diff() < L1_RDTSC_LIMIT)

Isn't a small diff between the value written to
exit_msr_store[0].value and L2's RDTSC result a good thing? I.e., we
want the MSR value captured by the host to be very close to the guest's
RDTSC value on the guest->host VM-exit. So shouldn't the condition be
(get_tsc_diff() >= L1_RDTSC_LIMIT)?

> +			fail++;
> +	}
> +
> +	enter_guest();
> +
> +	report("RDTSC to VM-exit delta too high in %d of %d iterations",
> +	       fail < RDTSC_DIFF_FAILS, fail, RDTSC_DIFF_ITERS);
> +}
>
>  static int invalid_msr_init(struct vmcs *vmcs)
>  {
> @@ -9056,5 +9143,7 @@ struct vmx_test vmx_tests[] = {
>  	/* Atomic MSR switch tests. */
>  	TEST(atomic_switch_max_msrs_test),
>  	TEST(atomic_switch_overflow_msrs_test),
> +	/* Miscellaneous tests */

You can consider it a de facto part of the "Atomic MSR switch tests"
and remove this comment.

> +	TEST(rdtsc_vmexit_diff_test),
>  	{ NULL, NULL, NULL, NULL, NULL, {0} },
>  };
> --
> 2.24.0.432.g9d3f5f5b63-goog
>
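
To make the point about the comparison in rdtsc_vmexit_diff_test()
concrete, here is a minimal standalone sketch of the counting logic
with the condition inverted, so that an iteration counts as a failure
only when the delta reaches the limit. This is not kvm-unit-tests code:
tsc_from_regs() and count_slow_exits() are hypothetical helper names,
and the deltas are passed in as an array instead of being measured
across a VM-exit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: rebuild the 64-bit TSC value from the EDX:EAX
 * halves that RDTSC leaves in the guest registers. Only the low 32 bits
 * of rax are meaningful, hence the cast.
 */
static uint64_t tsc_from_regs(uint64_t rax, uint64_t rdx)
{
	return (uint32_t)rax | (rdx << 32);
}

/*
 * Hypothetical helper: count the iterations whose RDTSC-to-VM-exit
 * delta is at or above the limit. A small delta is the good case, so
 * only deltas >= limit are counted as failures.
 */
static int count_slow_exits(const uint64_t *diffs, int n, uint64_t limit)
{
	int fail = 0;
	int i;

	assert(diffs != NULL && n >= 0);

	for (i = 0; i < n; i++) {
		if (diffs[i] >= limit)
			fail++;
	}

	return fail;
}
```

With this shape the final check in the patch, fail < RDTSC_DIFF_FAILS,
reads naturally as "fewer than 100 of the 100000 iterations were too
slow", which matches the 99.9% claim in the commit message.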