On Thu, Jul 07, 2022, Simon Veith wrote:
> The Time Stamp Counter (TSC) value register can be set to an absolute
> value using the KVM_SET_MSRS ioctl, which calls kvm_synchronize_tsc()
> internally.
>
> Since this is a per-vCPU register, and vCPUs are often managed by
> separate threads, setting a uniform TSC value across all vCPUs is
> challenging: After live migration, for example, the TSC value may need
> to be adjusted to account for the migration downtime. Ideally, we would
> want each vCPU to be adjusted by the same offset; but if we compute the
> offset centrally, the TSC value may become out of date due to scheduling
> delays by the time that each vCPU thread gets around to issuing
> KVM_SET_MSRS.
>
> In preparation for the next patch, this change adds an optional, KVM
> clock based time reference argument to kvm_synchronize_tsc(). This
> argument, if present, is understood to mean "the TSC value being written
> was valid at this corresponding KVM clock time point".

Given that commit 828ca89628bf ("KVM: x86: Expose TSC offset controls to
userspace") was merged less than a year ago, it would be helpful to
explicitly call out why KVM_VCPU_TSC_CTRL doesn't work, and why that
sub-ioctl can't be extended/modified to make it work.

> kvm_synchronize_tsc() will then use this clock reference to adjust the
> TSC value being written for any delays that have been incurred since the
> provided TSC value was valid.
>
> Co-developed-by: David Woodhouse <dwmw@xxxxxxxxxxxx>

Needs David's SOB.

And this patch should be squashed with the next patch. The kvm_ns != NULL
path is dead code here, e.g. even if the logic is wildly broken, bisection
will blame the patch that actually passes a non-null reference. Neither
patch is big enough to warrant splitting in this way.

And more importantly, the next patch's changelog provides a thorough
description of what it's doing, but there's very little in there that
describes _why_ KVM wants to provide this functionality.
> Signed-off-by: Simon Veith <sveith@xxxxxxxxx>
> ---
>  arch/x86/kvm/x86.c | 22 +++++++++++++++++-----
>  1 file changed, 17 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 1910e1e78b15..a44d083f1bf9 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2629,7 +2629,7 @@ static void __kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 offset, u64 tsc,
>  	kvm_track_tsc_matching(vcpu);
>  }
>
> -static void kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 data)
> +static void kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 data, u64 *kvm_ns)
>  {
>  	struct kvm *kvm = vcpu->kvm;
>  	u64 offset, ns, elapsed;
> @@ -2638,12 +2638,24 @@ static void kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 data)
>  	bool synchronizing = false;
>
>  	raw_spin_lock_irqsave(&kvm->arch.tsc_write_lock, flags);
> -	offset = kvm_compute_l1_tsc_offset(vcpu, data);
>  	ns = get_kvmclock_base_ns();
> +
> +	if (kvm_ns) {
> +		/*
> +		 * We have been provided a KVM clock reference time point at

Avoid pronouns, e.g. there's no need to say "We have been provided", just
state what kvm_ns is and how it's used.

> +		 * which this TSC value was correct.
> +		 * Use this time point to compensate for any delays that were
> +		 * incurred since that TSC value was valid.
> +		 */
> +		s64 delta_ns = ns + vcpu->kvm->arch.kvmclock_offset - *kvm_ns;
> +		data += nsec_to_cycles(vcpu, (u64)delta_ns);
> +	}