On Thu, Mar 30, 2023, Thomas Huth wrote:
> On 29/03/2023 21.11, Sean Christopherson wrote:
> > On Wed, Mar 29, 2023, Thomas Huth wrote:
> > >
> > > Hi,
> > >
> > > I noticed that in recent builds, the "memory" test started failing in the
> > > kvm-unit-test CI. After doing some experiments, I think it might rather be
> > > related to the environment than to a recent change in the k-u-t sources.
> > >
> > > It used to work fine with commit 2480430a here in January:
> > >
> > >  https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/3613156199#L2873
> > >
> > > Now I've re-run the CI with the same commit 2480430a here and it is failing now:
> > >
> > >  https://gitlab.com/thuth/kvm-unit-tests/-/jobs/4022074711#L2733
> >
> > Can you provide the logs from the failing test, and/or the build artifacts?  I
> > tried, and failed, to find them on Gitlab.
>
> Yes, that's still missing in the CI scripts ... I'll try to come up with a
> patch that provides the logs as artifacts.
>
> Meanwhile, here's a run with a manual "cat logs/memory.log":
>
>  https://gitlab.com/thuth/kvm-unit-tests/-/jobs/4029213352#L2726
>
> Seems like these are the failing memory tests:
>
> FAIL: clflushopt (ABSENT)
> FAIL: clwb (ABSENT)

More than likely what is happening is that the platform supports CLFLUSHOPT and
CLWB (possibly even via a ucode patch update), but the CPUID bits are not being
enumerated to the guest.  Neither VMX nor SVM has intercept controls for the
instructions, so KVM has no way to enforce the guest's CPUID model.  E.g. the
failures can be reproduced by manually hiding the features:

  ./x86/run x86/memory.flat -smp 1 -cpu max,-clflushopt,-clwb

This isn't a KVM bug because of the virtualization hole.  And really, the test
itself is bogus when running on KVM precisely because of said hole (similar holes
exist for all the other instructions in the test).
The test appears to have been added for QEMU's TCG, which makes sense as there
shouldn't be any virtualization holes in a pure emulation environment.

That said, it is interesting that the test is suddenly failing, as it means
something is buggy.  If you can run commands on the host, check for host support
via /proc/cpuinfo.  If those come back negative (no support), then it would appear
that hardware or the host kernel is in a bad/unexpected state.

  grep -q clflushopt /proc/cpuinfo
  grep -q clwb /proc/cpuinfo
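
Since "grep -q" prints nothing and only sets the exit status, a small wrapper
can make the result easier to paste into a reply.  The following is only a
sketch: has_cpu_flag and report_cpu_flags are hypothetical helper names (not
part of kvm-unit-tests), and the optional file argument exists purely so the
helpers can be pointed at a saved copy of /proc/cpuinfo.

```shell
#!/bin/sh
# Sketch only: has_cpu_flag/report_cpu_flags are illustrative helpers,
# not something shipped with kvm-unit-tests.

# Succeed (exit status 0) if the flag appears in the cpuinfo-style file.
has_cpu_flag() {
    flag=$1
    file=${2:-/proc/cpuinfo}
    # -w matches whole words only, so "clflush" does not match "clflushopt".
    grep -qw -- "$flag" "$file"
}

# Print one line per flag of interest, using the same "ABSENT" wording
# that shows up in the failing test log.
report_cpu_flags() {
    file=${1:-/proc/cpuinfo}
    for flag in clflushopt clwb; do
        if has_cpu_flag "$flag" "$file"; then
            echo "$flag: supported"
        else
            echo "$flag: ABSENT"
        fi
    done
}
```

After sourcing the snippet on the host, running "report_cpu_flags" with no
argument reads /proc/cpuinfo.  Note that this only reports what the host kernel
enumerates; inside a guest it would reflect the guest's CPUID model instead.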