(a short update ...)
On 7/3/24 1:14 PM, Reinette Chatre wrote:
On 6/28/24 5:39 PM, Sean Christopherson wrote:
Forking this off to try and avoid confusion...
On Wed, Jun 12, 2024, Reinette Chatre wrote:
...
+
+ freq = (tmict - tmcct) * tdcrs[i].divide_count * tsc_hz / (tsc1 - tsc0);
+ /* Check if measured frequency is within 1% of configured frequency. */
+ GUEST_ASSERT(freq < apic_hz * 101 / 100);
+ GUEST_ASSERT(freq > apic_hz * 99 / 100);
+ }
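For reference, the check in the quoted hunk can be sketched as a standalone calculation. This is a minimal sketch under stated assumptions, not the selftest code: the helper names measured_apic_hz() and within_one_percent() are hypothetical, and the operand widths are guesses based on the 32-bit APIC timer count registers.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: derive the APIC timer frequency from the number of
 * counter ticks consumed (tmict - tmcct), the divide configuration, and the
 * TSC cycles that elapsed over the same window.
 *
 * Assumes the intermediate product fits in 64 bits, which holds for short
 * measurement windows but not for arbitrary inputs.
 */
static uint64_t measured_apic_hz(uint32_t tmict, uint32_t tmcct,
				 uint32_t divide_count, uint64_t tsc_hz,
				 uint64_t tsc0, uint64_t tsc1)
{
	return (uint64_t)(tmict - tmcct) * divide_count * tsc_hz / (tsc1 - tsc0);
}

/* Hypothetical helper: same 1% tolerance as the two GUEST_ASSERT()s. */
static bool within_one_percent(uint64_t freq, uint64_t apic_hz)
{
	return freq < apic_hz * 101 / 100 && freq > apic_hz * 99 / 100;
}
```

As an illustration with made-up numbers: 2,500,000 counter ticks at divide-by-2 over 420,000,000 TSC cycles of a 2.1 GHz TSC works out to 25,000,000 Hz, which passes the check against a configured 25 MHz bus clock.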
This test fails on our SKX, CLX, and ICX systems due to what appears to be a CPU
bug. It looks like something APICv related is clobbering internal VMX timer state?
Or maybe there's a tearing or truncation issue?
It has been a few days. Just a note to let you know that we are investigating this.
On my side I have not yet been able to reproduce this issue. I tested
kvm-x86-next-2024.06.28 on an ICX and a CLX system by running 100 iterations of
apic_bus_clock_test and they all passed. Since I lack experience here, some
Intel virtualization experts are helping out with this investigation, and I hope that
there will be some insights from the analysis and testing that you already provided.
I have now been able to test on SKX as well and still cannot reproduce the issue. For
reference, the systems I tested on are:
SKX: https://ark.intel.com/content/www/us/en/ark/products/120507/intel-xeon-platinum-8170m-processor-35-75m-cache-2-10-ghz.html
ICX: https://ark.intel.com/content/www/us/en/ark/products/212459/intel-xeon-platinum-8360y-processor-54m-cache-2-40-ghz.html
CLX: https://ark.intel.com/content/www/us/en/ark/products/192476/intel-xeon-platinum-8260l-processor-35-75m-cache-2-40-ghz.html
Reinette