> On Jan 28, 2020, at 10:33 AM, Sean Christopherson <sean.j.christopherson@xxxxxxxxx> wrote:
>
> On Tue, Jan 28, 2020 at 09:59:45AM -0800, Jim Mattson wrote:
>> On Mon, Jan 27, 2020 at 12:56 PM Sean Christopherson
>> <sean.j.christopherson@xxxxxxxxx> wrote:
>>> On Mon, Jan 27, 2020 at 11:24:31AM -0800, Jim Mattson wrote:
>>>> On Sun, Jan 26, 2020 at 8:36 PM Nadav Amit <nadav.amit@xxxxxxxxx> wrote:
>>>>>> On Jan 26, 2020, at 2:06 PM, Jim Mattson <jmattson@xxxxxxxxxx> wrote:
>>>>>>
>>>>>> If I had to guess, you probably have SMM malware on your host. Remove
>>>>>> the malware, and the test should pass.
>>>>>
>>>>> Well, malware will always be an option, but I doubt this is the case.
>>>>
>>>> Was my innuendo too subtle? I consider any code executing in SMM to be malware.
>>>
>>> SMI complications seem unlikely. The straw that broke the camel's back
>>> was a 1152 cycle delta, presumably the other failing runs had similar deltas.
>>> I've never benchmarked SMI+RSM, but I highly doubt it comes anywhere close
>>> to VM-Enter/VM-Exit's super optimized ~400 cycle round trip. E.g. I
>>> wouldn't be surprised if just SMI+RSM is over 1500 cycles.
>>
>> Good point. What generation of hardware are you running on, Nadav?
>
> Skylake.

Indeed. Thanks for answering on my behalf ;-)

>
>>>>> Interestingly, in the last few times the failure did not reproduce. Yet,
>>>>> thinking about it made me concerned about MTRRs configuration, and that
>>>>> perhaps performance is affected by memory marked as UC after boot, since
>>>>> kvm-unit-test does not reset MTRRs.
>>>>>
>>>>> Reading the variable range MTRRs, I do see some ranges marked as UC (most of
>>>>> the range 2GB-4GB, if I read the MTRRs correctly):
>>>>>
>>>>> MSR 0x200 = 0x80000000
>>>>> MSR 0x201 = 0x3fff80000800
>>>>> MSR 0x202 = 0xff000005
>>>>> MSR 0x203 = 0x3fffff000800
>>>>> MSR 0x204 = 0x38000000000
>>>>> MSR 0x205 = 0x3f8000000800
>>>>>
>>>>> Do you think we should set the MTRRs somehow in KVM-unit-tests? If yes, can
>>>>> you suggest a reasonable configuration?
>>>>
>>>> I would expect MTRR issues to result in repeatable failures. For
>>>> instance, if your VMCS ended up in UC memory, that might slow things
>>>> down quite a bit. But, I would expect the VMCS to end up at the same
>>>> address each time the test is run.
>>>
>>> Agreed on the repeatable failures part, but putting the VMCS in UC memory
>>> shouldn't affect this type of test. The CPU's internal VMCS cache isn't
>>> coherent, and IIRC isn't disabled if the MTRRs for the VMCS happen to be
>>> UC.
>>
>> But the internal VMCS cache only contains selected fields, doesn't it?
>> Uncached fields would have to be written to memory on VM-exit. Or are
>> all of the mutable fields in the internal VMCS cache?
>
> Hmm. I can neither confirm nor deny? The official Intel response to this
> would be "it's microarchitectural". I'll put it this way: it's in Intel's
> best interest to minimize the latency of VMREAD, VMWRITE, VM-Enter and
> VM-Exit.

I will run some more experiments and get back to you. It is a shame that
every experiment requires a (real) boot…
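
For anyone who wants to double-check the reading of the variable-range MTRRs
quoted above, here is a minimal Python sketch that decodes the three base/mask
pairs. It is not part of kvm-unit-tests. The function name `decode_mtrr_pair`
is made up for illustration, and the physical address width is an assumption:
it is inferred from the highest set bit of each mask value (46 bits for these
values) rather than read from CPUID. The sketch also assumes each mask
describes a single contiguous, power-of-two sized range.

```python
# Memory type encodings used by the MTRRs (Intel SDM, "Memory Type Range Registers").
MTRR_TYPES = {0: "UC", 1: "WC", 4: "WT", 5: "WP", 6: "WB"}


def decode_mtrr_pair(physbase, physmask):
    """Decode one IA32_MTRR_PHYSBASEn / IA32_MTRR_PHYSMASKn pair.

    Returns (start, size, type_name, valid).
    """
    valid = bool(physmask & (1 << 11))          # bit 11 of the mask MSR is the valid bit
    type_name = MTRR_TYPES.get(physbase & 0xFF, "reserved")
    start = physbase & ~0xFFF                   # base address lives in bits 12 and up

    # ASSUMPTION: physical address width is taken from the highest set bit of
    # the mask instead of CPUID.80000008H; this only works if the mask's top
    # bits are all set, as they are in the values quoted in this thread.
    phys_bits = physmask.bit_length()
    addr_mask = (1 << phys_bits) - 1

    # ASSUMPTION: the mask is contiguous, so the range size is the number of
    # addresses that differ from the base only in the cleared mask bits.
    mask = physmask & ~0xFFF
    size = (~mask & addr_mask) + 1
    return start, size, type_name, valid


if __name__ == "__main__":
    pairs = [
        (0x80000000, 0x3FFF80000800),     # MSR 0x200 / 0x201
        (0xFF000005, 0x3FFFFF000800),     # MSR 0x202 / 0x203
        (0x38000000000, 0x3F8000000800),  # MSR 0x204 / 0x205
    ]
    for base, mask in pairs:
        start, size, type_name, valid = decode_mtrr_pair(base, mask)
        print(f"{start:#x} + {size:#x}: {type_name} (valid={valid})")
```

Under those assumptions the first pair decodes to a 2GB UC range starting at
0x80000000 (that is, 2GB-4GB), the second to a 16MB WP range at 0xff000000,
and the third to a 512GB UC range at 0x38000000000, which matches the "most
of the range 2GB-4GB" reading.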