On Thu, 2021-12-02 at 13:26 +0800, zhenwei pi wrote:
> On 12/2/21 10:48 AM, Thomas Gleixner wrote:
> > On Wed, Dec 01 2021 at 10:46, zhenwei pi wrote:
> > > If the host side supports APERF&MPERF feature, the guest side may get
> > > mismatched frequency.
> > >
> > > KVM uses x86_get_cpufreq_khz() to get the same frequency for guest side.
> > >
> > > Signed-off-by: zhenwei pi <pizhenwei@xxxxxxxxxxxxx>
> > > ---
> > >  arch/x86/kvm/x86.c | 4 +---
> > >  1 file changed, 1 insertion(+), 3 deletions(-)
> > >
> > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > index 5a403d92833f..125ed3c8b21a 100644
> > > --- a/arch/x86/kvm/x86.c
> > > +++ b/arch/x86/kvm/x86.c
> > > @@ -8305,10 +8305,8 @@ static void tsc_khz_changed(void *data)
> > >
> > >  	if (data)
> > >  		khz = freq->new;
> > > -	else if (!boot_cpu_has(X86_FEATURE_CONSTANT_TSC))
> > > -		khz = cpufreq_quick_get(raw_smp_processor_id());
> > >  	if (!khz)
> > > -		khz = tsc_khz;
> > > +		khz = x86_get_cpufreq_khz(raw_smp_processor_id());
> >
> > my brain compiler tells me that this is broken.
>
> Without this patch:
> 1, boot_cpu_has(X86_FEATURE_CONSTANT_TSC) is true:
> no kvmclock_cpufreq_notifier, and khz = tsc_khz;
>
> 2, boot_cpu_has(X86_FEATURE_CONSTANT_TSC) is false:
> during installing kmod, try cpufreq_quick_get(), or use tsc_khz;
> and get changed by kvmclock_cpufreq_notifier.
>
> With this patch:
> 1, boot_cpu_has(X86_FEATURE_CONSTANT_TSC) is true:
> no kvmclock_cpufreq_notifier, try aperf/mperf, or try
> cpufreq_quick_get(), or use cpu_khz
>
> 2, boot_cpu_has(X86_FEATURE_CONSTANT_TSC) is false:
> during installing kmod, try aperf/mperf, or try cpufreq_quick_get(), or
> use cpu_khz;
> and get changed by kvmclock_cpufreq_notifier.
>
> I tested on Skylake&Icelake CPU, and got different CPU frequency from
> host & guest, the main purpose of this patch is to get the same frequency.
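
To make the two orderings described above easier to compare, here is a rough
userspace model in C. It is only a sketch, not kernel code: struct freq_inputs,
pick_khz_before() and pick_khz_after() are hypothetical names invented for
illustration, and the aperf/mperf -> cpufreq_quick_get() -> cpu_khz order inside
x86_get_cpufreq_khz() is assumed purely from the description quoted above.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the values the two code paths would consult. */
struct freq_inputs {
	bool from_notifier;          /* models "data != NULL" in tsc_khz_changed() */
	unsigned int notifier_khz;   /* models freq->new */
	bool constant_tsc;           /* models boot_cpu_has(X86_FEATURE_CONSTANT_TSC) */
	unsigned int aperfmperf_khz; /* 0 when aperf/mperf is unavailable */
	unsigned int cpufreq_khz;    /* models cpufreq_quick_get(); 0 when unavailable */
	unsigned int tsc_khz;
	unsigned int cpu_khz;
};

/* Selection order without the patch, following the removed lines of the diff. */
unsigned int pick_khz_before(const struct freq_inputs *in)
{
	unsigned int khz = 0;

	if (in->from_notifier)
		khz = in->notifier_khz;
	else if (!in->constant_tsc)
		khz = in->cpufreq_khz;
	if (!khz)
		khz = in->tsc_khz;
	return khz;
}

/* Selection order with the patch, assuming the fallback chain described above. */
unsigned int pick_khz_after(const struct freq_inputs *in)
{
	unsigned int khz = 0;

	if (in->from_notifier)
		khz = in->notifier_khz;
	if (!khz) {
		if (in->aperfmperf_khz)
			khz = in->aperfmperf_khz;
		else if (in->cpufreq_khz)
			khz = in->cpufreq_khz;
		else
			khz = in->cpu_khz;
	}
	return khz;
}
```

Feeding this model a constant-TSC machine with tsc_khz = 3700230 and an
aperf/mperf reading of 2200000 (figures borrowed from the Zen2 numbers in this
thread), the first function returns the TSC frequency and the second returns
the current core frequency, so the two paths disagree whenever the core is not
running at its base clock.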

Note that on my Zen2 machine (3970X), aperf/mperf returns the current CPU
frequency, as seen in /proc/cpuinfo, while the TSC always runs at the base CPU
clock frequency (3.7 GHz). That is the maximum frequency the CPU is guaranteed
to run at; anything above it is a boost 'bonus'.

[mlevitsk@starship ~/Kernel/br-vm-64/src]$cat /proc/cpuinfo | grep "cpu MHz"
cpu MHz : 3685.333
cpu MHz : 2200.000
cpu MHz : 2200.000
cpu MHz : 2200.000
cpu MHz : 2200.000
cpu MHz : 2200.000
cpu MHz : 2200.000
cpu MHz : 2200.000
cpu MHz : 2200.000
cpu MHz : 2200.000
cpu MHz : 2200.000
cpu MHz : 2761.946
cpu MHz : 2200.000
cpu MHz : 2200.000
cpu MHz : 2200.000
...

[mlevitsk@starship ~/Kernel/master/src]$dmesg | grep tsc
[ 0.000000] tsc: Fast TSC calibration using PIT
[ 0.000000] tsc: Detected 3700.230 MHz processor
...

Before I forget about it, I do want to point out a few things that are not 100%
related to this thread but are related to the TSC:

1. It sucks that on AMD, the TSC frequency is calibrated from other
   clocksources like PIT/HPET, since the result is not exact and varies from
   boot to boot. I do wonder if they have something for this, like that
   APERF/MPERF thing, which sadly is not what I was looking for.

2. In the guest on AMD, we always mark the TSC as unsynchronized due to the
   code in unsynchronized_tsc(), unless invariant TSC is set in the guest
   CPUID, which is IMHO not fair to AMD as we don't do this for Intel CPUs
   (look at the unsynchronized_tsc() function).

3. I wish the kernel would export the TSC frequency it found to userspace
   somewhere in /sys or /proc, as this would be very useful for userspace
   applications. Currently it can only be found in dmesg, if I am not
   mistaken. I don't mind if such a frequency would only be exported when the
   TSC is stable, always running, not affected by cpufreq, etc.

Best regards,
	Maxim Levitsky
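
Regarding point 3 above: until something is exported, about the best userspace
can do is scrape the "tsc: Detected ..." line from dmesg. A minimal sketch in C
follows. The function name tsc_khz_from_dmesg_line() is hypothetical, and the
parser assumes the line looks exactly like the one quoted above (an integer MHz
part, a dot, and exactly three fractional digits); that format is not a stable
interface and may differ between kernels.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one dmesg line of the form
 *   "[ 0.000000] tsc: Detected 3700.230 MHz processor"
 * and return the frequency in kHz, or 0 if the line does not match.
 * Assumption: the fractional part always has exactly three digits.
 */
unsigned long tsc_khz_from_dmesg_line(const char *line)
{
	const char *p = strstr(line, "tsc: Detected ");
	unsigned long mhz = 0, frac = 0;
	int start = -1, end = -1;

	if (!p)
		return 0;
	/* %n records how many characters were consumed so far (offsets from p). */
	if (sscanf(p, "tsc: Detected %lu.%n%3lu%n", &mhz, &start, &frac, &end) != 2)
		return 0;
	/* Require exactly three fractional digits, followed by " MHz". */
	if (end - start != 3)
		return 0;
	if (strncmp(p + end, " MHz", 4) != 0)
		return 0;
	return mhz * 1000 + frac;
}
```

A caller would read the output of dmesg line by line and keep the first
non-zero result. Lines such as "tsc: Fast TSC calibration using PIT" are
rejected, and the value is only as precise as what the kernel chose to print.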