On Mon, Jun 27, 2022 at 11:32 PM <bugzilla-daemon@xxxxxxxxxx> wrote:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=216177
>
> --- Comment #9 from Yang Lixiao (lixiao.yang@xxxxxxxxx) ---
> (In reply to Jim Mattson from comment #8)
> > On Mon, Jun 27, 2022 at 8:54 PM Nadav Amit <nadav.amit@xxxxxxxxx> wrote:
> >
> > > The failure on bare-metal that I experienced hints that this is
> > > either a test bug or (much less likely) a hardware bug. But I do
> > > not think it is likely to be a KVM bug.
> >
> > KVM does not use the VMX-preemption timer to virtualize L1's
> > VMX-preemption timer (and that is why KVM is broken). The KVM bug was
> > introduced with commit f4124500c2c1 ("KVM: nVMX: Fully emulate
> > preemption timer"), which uses an L0 CLOCK_MONOTONIC hrtimer to
> > emulate L1's VMX-preemption timer. There are many reasons that this
> > cannot possibly work, not the least of which is that the
> > CLOCK_MONOTONIC timer is subject to time slew.
> >
> > Currently, KVM reserves L0's VMX-preemption timer for emulating L1's
> > APIC timer. Better would be to determine whether L1's APIC timer or
> > L1's VMX-preemption timer is scheduled to fire first, and use L0's
> > VMX-preemption timer to trigger a VM-exit on the nearest alarm.
> > Alternatively, as Sean noted, one could perhaps arrange for the
> > hrtimer to fire early enough that it won't fire late, but I don't
> > really think that's a viable solution.
> >
> > I can't explain the bare-metal failures, but I will note that the test
> > assumes the default treatment of SMIs and SMM. The test will likely
> > fail with the dual-monitor treatment of SMIs and SMM. Aside from the
> > older CPUs with broken VMX-preemption timers, I don't know of any
> > relevant errata.
> >
> > Of course, it is possible that the test itself is buggy. For the
> > person who reported bare-metal failures on Ice Lake and Cooper Lake,
> > how long was the test in VMX non-root mode past the VMX-preemption
> > timer deadline?
>
> On the first Ice lake:
> Test suite: vmx_preemption_timer_expiry_test
> FAIL: Last stored guest TSC (28067103426) < TSC deadline (28067086048)
>
> On the second Ice lake:
> Test suite: vmx_preemption_timer_expiry_test
> FAIL: Last stored guest TSC (27014488614) < TSC deadline (27014469152)
>
> On Cooper lake:
> Test suite: vmx_preemption_timer_expiry_test
> FAIL: Last stored guest TSC (29030585690) < TSC deadline (29030565024)

Wow! Those are *huge* overruns. What is the value of MSR 0x9B on these hosts?
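
For reference, the size of each overrun can be read straight off the
quoted FAIL lines by subtracting the TSC deadline from the last stored
guest TSC. A minimal sketch in Python; the helper name overrun_cycles
is made up for illustration, and the only inputs are the numbers
reported above. Converting cycles to wall-clock time would need each
host's TSC frequency, which is not given in this thread, so the sketch
stays in cycles.

```python
# Values copied from the three FAIL lines quoted above:
# (host label, last stored guest TSC, TSC deadline)
REPORTS = [
    ("first Ice Lake", 28067103426, 28067086048),
    ("second Ice Lake", 27014488614, 27014469152),
    ("Cooper Lake", 29030585690, 29030565024),
]


def overrun_cycles(last_guest_tsc: int, tsc_deadline: int) -> int:
    """Hypothetical helper: TSC cycles spent in the guest past the deadline.

    A positive result means the guest was still storing TSC values after
    the deadline had passed, which is what the test reports as a failure.
    """
    return last_guest_tsc - tsc_deadline


if __name__ == "__main__":
    for host, last_tsc, deadline in REPORTS:
        cycles = overrun_cycles(last_tsc, deadline)
        print(f"{host}: {cycles} cycles past the deadline")
```

On the reported numbers this gives 17378, 19462, and 20666 cycles
respectively.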