Re: [kvm-unit-tests PATCH v3] x86: Add RDTSC test

On Tue, Jan 28, 2020 at 11:03 AM Nadav Amit <nadav.amit@xxxxxxxxx> wrote:
>
> > On Jan 28, 2020, at 10:43 AM, Jim Mattson <jmattson@xxxxxxxxxx> wrote:
> >
> > On Tue, Jan 28, 2020 at 10:42 AM Nadav Amit <nadav.amit@xxxxxxxxx> wrote:
> >>> On Jan 28, 2020, at 10:33 AM, Sean Christopherson <sean.j.christopherson@xxxxxxxxx> wrote:
> >>>
> >>> On Tue, Jan 28, 2020 at 09:59:45AM -0800, Jim Mattson wrote:
> >>>> On Mon, Jan 27, 2020 at 12:56 PM Sean Christopherson
> >>>> <sean.j.christopherson@xxxxxxxxx> wrote:
> >>>>> On Mon, Jan 27, 2020 at 11:24:31AM -0800, Jim Mattson wrote:
> >>>>>> On Sun, Jan 26, 2020 at 8:36 PM Nadav Amit <nadav.amit@xxxxxxxxx> wrote:
> >>>>>>>> On Jan 26, 2020, at 2:06 PM, Jim Mattson <jmattson@xxxxxxxxxx> wrote:
> >>>>>>>>
> >>>>>>>> If I had to guess, you probably have SMM malware on your host. Remove
> >>>>>>>> the malware, and the test should pass.
> >>>>>>>
> >>>>>>> Well, malware will always be an option, but I doubt this is the case.
> >>>>>>
> >>>>>> Was my innuendo too subtle? I consider any code executing in SMM to be malware.
> >>>>>
> >>>>> SMI complications seem unlikely.  The straw that broke the camel's back
> >>>>> was a 1152 cycle delta, presumably the other failing runs had similar deltas.
> >>>>> I've never benchmarked SMI+RSM, but I highly doubt it comes anywhere close
> >>>>> to VM-Enter/VM-Exit's super optimized ~400 cycle round trip.  E.g. I
> >>>>> wouldn't be surprised if just SMI+RSM is over 1500 cycles.
> >>>>
> >>>> Good point. What generation of hardware are you running on, Nadav?
> >>>
> >>> Skylake.
> >>
> >> Indeed. Thanks for answering on my behalf ;-)
> >>
> >>>>>>> Interestingly, in the last few times the failure did not reproduce. Yet,
> >>>>>>> thinking about it made me concerned about the MTRR configuration, and that
> >>>>>>> perhaps performance is affected by memory marked as UC after boot, since
> >>>>>>> kvm-unit-test does not reset MTRRs.
> >>>>>>>
> >>>>>>> Reading the variable range MTRRs, I do see some ranges marked as UC (most of
> >>>>>>> the range 2GB-4GB, if I read the MTRRs correctly):
> >>>>>>>
> >>>>>>> MSR 0x200 = 0x80000000
> >>>>>>> MSR 0x201 = 0x3fff80000800
> >>>>>>> MSR 0x202 = 0xff000005
> >>>>>>> MSR 0x203 = 0x3fffff000800
> >>>>>>> MSR 0x204 = 0x38000000000
> >>>>>>> MSR 0x205 = 0x3f8000000800
> >>>>>>>
> >>>>>>> Do you think we should set the MTRRs somehow in KVM-unit-tests? If yes, can
> >>>>>>> you suggest a reasonable configuration?
> >>>>>>
> >>>>>> I would expect MTRR issues to result in repeatable failures. For
> >>>>>> instance, if your VMCS ended up in UC memory, that might slow things
> >>>>>> down quite a bit. But, I would expect the VMCS to end up at the same
> >>>>>> address each time the test is run.
> >>>>>
> >>>>> Agreed on the repeatable failures part, but putting the VMCS in UC memory
> >>>>> shouldn't affect this type of test.  The CPU's internal VMCS cache isn't
> >>>>> coherent, and IIRC isn't disabled if the MTRRs for the VMCS happen to be
> >>>>> UC.
> >>>>
> >>>> But the internal VMCS cache only contains selected fields, doesn't it?
> >>>> Uncached fields would have to be written to memory on VM-exit. Or are
> >>>> all of the mutable fields in the internal VMCS cache?
> >>>
> >>> Hmm.  I can neither confirm nor deny?  The official Intel response to this
> >>> would be "it's microarchitectural".  I'll put it this way: it's in Intel's
> >>> best interest to minimize the latency of VMREAD, VMWRITE, VM-Enter and
> >>> VM-Exit.
> >>
> >> I will run some more experiments and get back to you. It is a shame that
> >> every experiment requires a (real) boot…
> >
> > Yes! It's not just a shame; it's a serious usability issue.
>
> The easy way to run these experiments would have been to use an Intel CRB
> (Customer Reference Board), which boots relatively fast, with an ITP
> (In-Target Probe). This would have simplified testing and debugging
> considerably. Perhaps some sort of PXE-boot would also be beneficial.
> Unfortunately, I do not have the hardware, and it does not seem others care
> that much so far.

Not true. I think others do care (I know I do). It's just that bare
metal testing is too hard right now. We need a way to "fire and
forget" the entire test suite and then check a log for failures once
it's done. I'm not suggesting that you do that; I'm just suggesting
that lowering the barrier to bare-metal testing would increase the
likelihood of people doing it.
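
For anyone who does want to try this on bare metal, the measurement being
argued about above is roughly the following shape: time an instruction that
unconditionally exits when run under a hypervisor and compare the guest
number against the bare-metal number. This is only an illustration, not the
actual kvm-unit-tests code; the iteration count is arbitrary and I'm assuming
GCC/Clang intrinsics on x86-64:

#include <stdint.h>
#include <stdio.h>
#include <cpuid.h>          /* __get_cpuid(), GCC/Clang specific */
#include <x86intrin.h>      /* __rdtsc(), _mm_lfence() */

/* Time one CPUID, fenced so the TSC reads don't get reordered around it. */
static uint64_t time_cpuid(void)
{
    unsigned int eax, ebx, ecx, edx;
    uint64_t start, end;

    _mm_lfence();
    start = __rdtsc();
    __get_cpuid(0, &eax, &ebx, &ecx, &edx);  /* unconditionally exits in a guest */
    _mm_lfence();
    end = __rdtsc();
    _mm_lfence();

    return end - start;
}

int main(void)
{
    uint64_t best = UINT64_MAX;

    /* Take the minimum over many runs to filter out interrupts (and SMIs). */
    for (int i = 0; i < 100000; i++) {
        uint64_t t = time_cpuid();
        if (t < best)
            best = t;
    }

    printf("min CPUID round trip: %llu cycles\n", (unsigned long long)best);
    return 0;
}

The guest-minus-host difference from something like this is roughly the kind
of delta the test threshold has to live with.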

> Despite the usability issues, running the tests on bare metal has already
> revealed several bugs in KVM (and one SDM issue), which were not apparent
> because the tests themselves were wrong.

I'm not surprised. It's ludicrous that the test results are not
verified on hardware.
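
As an aside on the MTRR question: the three variable-range pairs quoted above
appear to decode to 2GB-4GB as UC (so Nadav's reading looks right), 16MB of WP
at 0xff000000 (presumably the flash/BIOS region), and 512GB of UC starting at
3.5TB. A minimal decoding sketch with those values hard-coded rather than read
with RDMSR (layout per the SDM's IA32_MTRR_PHYSBASEn/PHYSMASKn definitions;
the size calculation assumes contiguous masks, which these are):

#include <stdint.h>
#include <stdio.h>

static const char *type_name(uint64_t type)
{
    switch (type) {
    case 0: return "UC";
    case 1: return "WC";
    case 4: return "WT";
    case 5: return "WP";
    case 6: return "WB";
    default: return "??";
    }
}

int main(void)
{
    /* { IA32_MTRR_PHYSBASEn, IA32_MTRR_PHYSMASKn } pairs, MSRs 0x200-0x205. */
    static const uint64_t mtrr[][2] = {
        { 0x80000000ULL,    0x3fff80000800ULL },
        { 0xff000005ULL,    0x3fffff000800ULL },
        { 0x38000000000ULL, 0x3f8000000800ULL },
    };

    for (unsigned int i = 0; i < sizeof(mtrr) / sizeof(mtrr[0]); i++) {
        uint64_t base  = mtrr[i][0] & ~0xfffULL;   /* base address, 4K aligned */
        uint64_t type  = mtrr[i][0] & 0xffULL;     /* memory type in bits 7:0  */
        uint64_t mask  = mtrr[i][1] & ~0xfffULL;   /* address mask             */
        int      valid = !!(mtrr[i][1] & (1ULL << 11));

        /* For a contiguous mask, the range size is its lowest set bit. */
        uint64_t size = mask & -mask;

        printf("MTRR%u: 0x%011llx - 0x%011llx  %s%s\n", i,
               (unsigned long long)base,
               (unsigned long long)(base + size),
               type_name(type), valid ? "" : " (disabled)");
    }
    return 0;
}

None of those ranges should cover where kvm-unit-tests puts its code, stack,
or the VMCS on a machine with less than 2GB in use by the test, which is
consistent with the failures not being repeatable.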



