Re: [kvm-unit-tests PATCH v3] x86: Add RDTSC test

Jim Mattson <jmattson@xxxxxxxxxx> · Tue, 28 Jan 2020 10:43:36 -0800

On Tue, Jan 28, 2020 at 10:42 AM Nadav Amit <nadav.amit@xxxxxxxxx> wrote:
>
> > On Jan 28, 2020, at 10:33 AM, Sean Christopherson <sean.j.christopherson@xxxxxxxxx> wrote:
> >
> > On Tue, Jan 28, 2020 at 09:59:45AM -0800, Jim Mattson wrote:
> >> On Mon, Jan 27, 2020 at 12:56 PM Sean Christopherson
> >> <sean.j.christopherson@xxxxxxxxx> wrote:
> >>> On Mon, Jan 27, 2020 at 11:24:31AM -0800, Jim Mattson wrote:
> >>>> On Sun, Jan 26, 2020 at 8:36 PM Nadav Amit <nadav.amit@xxxxxxxxx> wrote:
> >>>>>> On Jan 26, 2020, at 2:06 PM, Jim Mattson <jmattson@xxxxxxxxxx> wrote:
> >>>>>>
> >>>>>> If I had to guess, you probably have SMM malware on your host. Remove
> >>>>>> the malware, and the test should pass.
> >>>>>
> >>>>> Well, malware will always be an option, but I doubt this is the case.
> >>>>
> >>>> Was my innuendo too subtle? I consider any code executing in SMM to be malware.
> >>>
> >>> SMI complications seem unlikely.  The straw that broke the camel's back
> >>> was a 1152 cycle delta; presumably the other failing runs had similar deltas.
> >>> I've never benchmarked SMI+RSM, but I highly doubt it comes anywhere close
> >>> to VM-Enter/VM-Exit's super optimized ~400 cycle round trip.  E.g. I
> >>> wouldn't be surprised if just SMI+RSM is over 1500 cycles.
> >>
> >> Good point. What generation of hardware are you running on, Nadav?
> >
> > Skylake.
>
> Indeed. Thanks for answering on my behalf ;-)
>
> >
> >>>>> Interestingly, the failure did not reproduce in the last few runs. Yet
> >>>>> thinking about it made me concerned about the MTRR configuration, and that
> >>>>> perhaps performance is affected by memory marked as UC after boot, since
> >>>>> kvm-unit-tests does not reset MTRRs.
> >>>>>
> >>>>> Reading the variable range MTRRs, I do see some ranges marked as UC (most of
> >>>>> the range 2GB-4GB, if I read the MTRRs correctly):
> >>>>>
> >>>>>  MSR 0x200 = 0x80000000
> >>>>>  MSR 0x201 = 0x3fff80000800
> >>>>>  MSR 0x202 = 0xff000005
> >>>>>  MSR 0x203 = 0x3fffff000800
> >>>>>  MSR 0x204 = 0x38000000000
> >>>>>  MSR 0x205 = 0x3f8000000800
> >>>>>
> >>>>> Do you think we should set the MTRRs somehow in kvm-unit-tests? If yes, can
> >>>>> you suggest a reasonable configuration?
> >>>>
> >>>> I would expect MTRR issues to result in repeatable failures. For
> >>>> instance, if your VMCS ended up in UC memory, that might slow things
> >>>> down quite a bit. But, I would expect the VMCS to end up at the same
> >>>> address each time the test is run.
> >>>
> >>> Agreed on the repeatable failures part, but putting the VMCS in UC memory
> >>> shouldn't affect this type of test.  The CPU's internal VMCS cache isn't
> >>> coherent, and IIRC isn't disabled if the MTRRs for the VMCS happen to be
> >>> UC.
> >>
> >> But the internal VMCS cache only contains selected fields, doesn't it?
> >> Uncached fields would have to be written to memory on VM-exit. Or are
> >> all of the mutable fields in the internal VMCS cache?
> >
> > Hmm.  I can neither confirm nor deny?  The official Intel response to this
> > would be "it's microarchitectural".  I'll put it this way: it's in Intel's
> > best interest to minimize the latency of VMREAD, VMWRITE, VM-Enter and
> > VM-Exit.
>
> I will run some more experiments and get back to you. It is a shame that
> every experiment requires a (real) boot…

Yes! It's not just a shame; it's a serious usability issue.
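
For reference, the variable-range MTRR values quoted above can be decoded
mechanically. What follows is only a minimal Python sketch, not code from
kvm-unit-tests. The function name decode_variable_mtrr is made up for
illustration, and the sketch assumes each mask is non-zero and contiguous, so
that the range size equals the lowest set bit of the mask.

```python
# Memory type encodings used by the MTRRs (Intel SDM, MTRR chapter).
MTRR_TYPES = {0: "UC", 1: "WC", 4: "WT", 5: "WP", 6: "WB"}


def decode_variable_mtrr(physbase, physmask):
    """Decode one IA32_MTRR_PHYSBASEn / IA32_MTRR_PHYSMASKn pair.

    Returns None if the valid bit (bit 11 of PHYSMASK) is clear,
    otherwise (start, end_inclusive, type_name).
    Assumption: the mask is non-zero and contiguous.
    """
    if not physmask & (1 << 11):
        return None
    base = physbase & ~0xFFF          # bits 11:0 hold the type, not address
    mask = physmask & ~0xFFF          # bits 11:0 hold the valid bit, not mask
    size = mask & -mask               # lowest set mask bit gives the size
    mem_type = MTRR_TYPES.get(physbase & 0xFF, "reserved")
    return base, base + size - 1, mem_type


# The three pairs quoted in the thread (MSRs 0x200 through 0x205).
pairs = [
    (0x80000000, 0x3FFF80000800),
    (0xFF000005, 0x3FFFFF000800),
    (0x38000000000, 0x3F8000000800),
]

for physbase, physmask in pairs:
    decoded = decode_variable_mtrr(physbase, physmask)
    if decoded is not None:
        start, end, mem_type = decoded
        print(f"{start:#x}-{end:#x} {mem_type}")
```

Under those assumptions the three pairs decode to 0x80000000-0xffffffff UC,
0xff000000-0xffffffff WP, and 0x38000000000-0x3ffffffffff UC, which matches
the reading that 2GB-4GB is marked UC.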