On Fri, May 12, 2017 at 04:13:23PM +0200, Radim Krčmář wrote: > 2017-05-11 12:39-0300, Marcelo Tosatti: > > On Wed, May 10, 2017 at 08:04:31PM +0200, Radim Krčmář wrote: > > > 2017-05-03 10:43-0300, Marcelo Tosatti: > > > and the important fix for kvm master clock is the move of > > > kvm_gen_update_masterclock() before we read the time. > > > > The rest is just a minor optimization that also ignores time since > > > master_kernel_ns() and therefore pins user_ns.clock to a slightly > > > earlier time. > > > > > > But all attention was given to the "minor optimization" -- have I missed > > > something about the direct use of ka->master_kernel_ns? > > > > I haven't attempted to optimize anything. Not sure what you mean. > > I mean, why doesn't the patch look like this? d > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 464da936c53d..8db1d09e59d7 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -4175,9 +4175,10 @@ long kvm_arch_vm_ioctl(struct file *filp, > goto out; > > r = 0; > + kvm_gen_update_masterclock(kvm); > now_ns = get_kvmclock_ns(kvm); > kvm->arch.kvmclock_offset += user_ns.clock - now_ns; > - kvm_gen_update_masterclock(kvm); > + kvm_make_all_cpus_request(kvm, KVM_REQ_CLOCK_UPDATE); > break; now_ns = ka->master_kernel_ns + kvmclock_offset_prev + (grdtsc() - ka->master_cycle_now) kvmclock_offset += user_ns.clock - (ka->master_kernel_ns + kvmclock_offset_prev + (grdtsc() - ka->master_cycle_now) kvmclock_offset = kvmclock_offset_prev + user_ns.clock - (ka->master_kernel_ns + kvmclock_offset_prev + (grdtsc() - ka->master_cycle_now) In case of VM was just initialized before migration, kvmclock_offset_prev is -ktime_get_boot_ns() kvmclock_offset = -ktime_get_boot_ns() + user_ns.clock - (ka->master_kernel_ns -ktime_get_boot_ns() + grdtsc() - ka->master_cycle_now)) But master_kernel_ns = ktime_get_boot_ns() + delta-between-vm-init-and-KVM_SET_CLOCK (AKA delta) (the same one from VM init) kvmclock_offset = -ktime_get_boot_ns() + user_ns.clock - (ktime_get_boot_ns() + delta + -ktime_get_boot_ns() + grdtsc() - ka->master_cycle_now)) kvmclock_offset = -ktime_get_boot_ns() + user_ns.clock - delta - grdtsc() + ka->master_cycle_now But we don't want grdtsc() - ka->master_cycle_now in there. Note: grdtsc() == guest read tsc. Now with + kvm->arch.kvmclock_offset = user_ns.clock - + ka->master_kernel_ns; What happens is that guest clock starts counting, via kernel timekeeper, at the moment kvm_get_time_and_clockread() runs. If you add grdtsc() - ka->master_cycle_now in there, you are mindfully counting clock twice (first: kernel timekeeper, second: the TSC between the (grdtsc() - ka->master_cycle_now) in question. + kvm->arch.kvmclock_offset = -ktime_get_boot_ns() +user_ns.clock -delta Note that (grdtsc() - ka->master_cycle_now) is susceptible to scheduling etc. Makes sense?