On Wed, Jun 08, 2022, Anton Romanov wrote:
> Don't snapshot tsc_khz into per-cpu cpu_tsc_khz if the host TSC is
> constant, in which case the actual TSC frequency will never change and thus
> capturing TSC during initialization is unnecessary; KVM can simply use
> tsc_khz. This value is snapshotted from
> kvm_timer_init->kvmclock_cpu_online->tsc_khz_changed(NULL)
>
> On CPUs with constant TSC, but not a hardware-specified TSC frequency,
> snapshotting cpu_tsc_khz and using that to set a VM's target TSC frequency
> can lead a VM to think its TSC frequency is not what it actually is if
> refining the TSC completes after KVM snapshots tsc_khz. The actual
> frequency never changes, only the kernel's calculation of what that
> frequency is changes.
>
> Ideally, KVM would not be able to race with TSC refinement, or would have
> a hook into tsc_refine_calibration_work() to get an alert when refinement
> is complete. Avoiding the race altogether isn't practical as refinement
> takes a relative eternity; it's deliberately put on a work queue outside of
> the normal boot sequence to avoid unnecessarily delaying boot.
>
> Adding a hook is doable, but somewhat gross due to KVM's ability to be
> built as a module. And if the TSC is constant, which is likely the case
> for every VMX/SVM-capable CPU produced in the last decade, the race can be
> hit if and only if userspace is able to create a VM before TSC refinement
> completes; refinement is slow, but not that slow.
>
> For now, punt on a proper fix, as not taking a snapshot can help some use
> cases and not taking a snapshot is arguably correct irrespective of the
> race with refinement.
>
> Signed-off-by: Anton Romanov <romanton@xxxxxxxxxx>
> Reviewed-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> ---

Merged to kvm/queue, thanks!

https://lore.kernel.org/all/Y4lHxds8pvBhxXFX@xxxxxxxxxx
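
For readers following along, the race described in the quoted commit message
can be modeled roughly as below. This is a userspace sketch only, not the
kernel code: struct host_tsc and every model_* function are hypothetical
names invented for illustration, and the frequencies used with it are made
up. It only shows why a snapshot taken before refinement goes stale, while
reading the live value does not.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the host state; not a kernel structure. */
struct host_tsc {
    unsigned long tsc_khz;      /* kernel's current estimate of the TSC rate */
    unsigned long snapshot_khz; /* copy captured once at (modeled) KVM init */
    bool constant_tsc;          /* true if the hardware rate never changes */
};

/* Modeled KVM init: capture the estimate as it stands right now. */
void model_kvm_init(struct host_tsc *h)
{
    h->snapshot_khz = h->tsc_khz;
}

/* Modeled refinement: the estimate improves, the hardware rate does not. */
void model_refine(struct host_tsc *h, unsigned long refined_khz)
{
    h->tsc_khz = refined_khz;
}

/* Behavior before the change: always hand the VM the snapshot. */
unsigned long model_vm_tsc_khz_old(const struct host_tsc *h)
{
    return h->snapshot_khz;
}

/*
 * Behavior after the change: with a constant TSC, read the live estimate,
 * so a refinement that finishes late is still picked up.
 */
unsigned long model_vm_tsc_khz_new(const struct host_tsc *h)
{
    return h->constant_tsc ? h->tsc_khz : h->snapshot_khz;
}
```

If model_kvm_init() runs while the estimate is, say, 2400000 kHz and
model_refine() later settles on 2399987 kHz, the old path keeps reporting
2400000 to the VM while the new path reports 2399987.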