From: Sean Christopherson <seanjc@xxxxxxxxxx>
Sent: Monday, October 31, 2022 5:59 PM
To: Jayaramappa, Srilakshmi
Cc: kvm@xxxxxxxxxxxxxxx; pbonzini@xxxxxxxxxx; vkuznets@xxxxxxxxxx; mlevitsk@xxxxxxxxxx; suleiman@xxxxxxxxxx; Hunt, Joshua
Subject: Re: KVM: x86: snapshotted TSC frequency causing time drifts in vms

On Mon, Oct 31, 2022, Jayaramappa, Srilakshmi wrote:
> Hi,
>
> We were recently notified of significant time drift on some of our virtual
> machines. Upon investigation it was found that the jumps in time were larger
> than ntp was able to gracefully correct. After further probing we discovered
> that the affected vms booted with tsc frequency equal to the early tsc
> frequency of the host and not the calibrated frequency.
>
> There were two variables that cached tsc_khz - cpu_tsc_khz and max_tsc_khz.
> Caching max_tsc_khz would cause further scaling of the user_tsc_khz when the
> vcpu is created after the host tsc calibration and kvm is loaded before
> calibration. But it appears that Sean's commit "KVM: x86: Don't snapshot
> "max" TSC if host TSC is constant" would fix that issue. [1]
>
> The cached cpu_tsc_khz is used in:
>   1. get_kvmclock_ns(), which incorrectly sets the factors
>      hv_clock.tsc_to_system_mul and hv_clock.shift that estimate the
>      passage of time.
>   2. kvm_guest_time_update()
>
> We came across Anton Romanov's patch "KVM: x86: Use current rather than
> snapshotted TSC frequency if it is constant" [2] that seems to address the
> cached cpu_tsc_khz case. The patch description says "the race can be hit if
> and only if userspace is able to create a VM before TSC refinement
> completes". We think that as long as the kvm module is loaded before the
> host tsc calibration happens, the vms can be created at any time and they
> will have the problem (confirmed this by shutting down an affected vm and
> relaunching it - it continued to experience time issues). VMs need not be
> created before tsc refinement.
>
> Even if the kvm module loads and a vcpu is created before the host tsc
> refinement, and the vm has incorrect time estimation until the refinement
> completes, the patches referenced here would subsequently provide the
> correct factors to determine time. And any error in time in that small
> interval can be corrected by ntp if it is running on the guest. If there
> was no ntp, the error would probably be negligible and would not accumulate.
>
> There doesn't seem to be any response on the v6 of Anton's patch. I wanted to
> ask if there are further changes in progress or if it is all set to be merged?

Drat, it slipped through the cracks. Paolo, can you pick up the below patch?
Obviously assuming you don't spy any problems.

It has a superficial conflict with commit 938c8745bcf2 ("KVM: x86: Introduce
"struct kvm_caps" to track misc caps/settings"), but otherwise applies cleanly.

> [2] https://urldefense.com/v3/__https://lore.kernel.org/all/20220608183525.1143682-1-romanton@xxxxxxxxxx/__;!!GjvTz_vk!QH6DrxJkEWcYdjwasd9zcBVokREj7lO9qb6tynY5SpQoRRXRxi959dCvoy_sbU9oRcrSbNCxXwA_dw$

Thanks, Sean! Appreciate it.

-Sri
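
For readers following the thread, here is a rough sketch of why a stale cached frequency makes the error accumulate. It assumes the usual pvclock-style conversion, where a TSC delta is shifted and then multiplied by a 32-bit fixed-point factor derived from tsc_khz. The helper names (time_scale, tsc_to_ns) and both frequencies are hypothetical, picked only for illustration, and the factor selection is a simplification rather than the kernel's actual helper.

```python
NSEC_PER_SEC = 1_000_000_000


def time_scale(tsc_khz):
    """Return (mul, shift) so that ns ~= ((delta shifted by shift) * mul) >> 32.

    Simplified sketch: choose shift so that mul fits in 32 bits with its top
    bit set. This is not the kernel's exact algorithm.
    """
    tsc_hz = tsc_khz * 1000

    def mul_for(shift):
        # mul = 2**32 * NSEC_PER_SEC / (tsc_hz * 2**shift), in integer math.
        if shift >= 0:
            return (NSEC_PER_SEC << 32) // (tsc_hz << shift)
        return (NSEC_PER_SEC << (32 - shift)) // tsc_hz

    shift = 0
    while mul_for(shift) >= 1 << 32:  # factor does not fit in 32 bits
        shift += 1
    while mul_for(shift) < 1 << 31:   # factor is wasting precision
        shift -= 1
    return mul_for(shift), shift


def tsc_to_ns(delta, mul, shift):
    """Scale a TSC delta to nanoseconds: shift first, then multiply."""
    delta = delta << shift if shift >= 0 else delta >> -shift
    return (delta * mul) >> 32


# Hypothetical frequencies, not taken from the thread.
EARLY_TSC_KHZ = 2_400_000    # early estimate, cached before refinement
REFINED_TSC_KHZ = 2_399_000  # refined value the TSC actually ticks at

# Ticks the hardware really produces in one hour.
ticks_per_hour = REFINED_TSC_KHZ * 1000 * 3600

stale_ns = tsc_to_ns(ticks_per_hour, *time_scale(EARLY_TSC_KHZ))
good_ns = tsc_to_ns(ticks_per_hour, *time_scale(REFINED_TSC_KHZ))
drift_ns = good_ns - stale_ns

print(f"guest clock falls behind by {drift_ns / 1e9:.3f} s per hour")
```

With these made-up numbers the guest loses about 1.5 seconds per hour, and it keeps losing time for as long as the stale factors are in use. That matches the observation above that relaunching a vm does not help while the cached value stays the same.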