Hi Alexander, On Mon, May 05, 2014 at 03:51:22PM +0200, Alexander Graf wrote: > When we migrate we ask the kernel about its current belief on what the guest > time would be. KVM_GET_CLOCK which returns the time in "struct kvm_clock_data". > However, I've seen cases where the kvmclock guest structure > indicates a time more recent than the kvm returned time. More details please: 1) By what algorithm you retrieve and compare time in kvmclock guest structure and KVM_GET_CLOCK. What are the results of the comparison. And whether and backwards time was visible in the guest. 2) What is the host clocksource. The test below is not a good one because: T1) KVM_GET_CLOCK (save s->clock). T2) save env->tsc. The difference in scaled time between T1 and T2 is larger than 1 nanosecond, so the (time_at_migration > s->clock) check is almost always positive (what matters though is whether time backwards event can be seen reading kvmclock in the guest). > To make sure we never go backwards, calculate what the guest would have seen > as time at the point of migration and use that value instead of the kernel > returned one when it's more recent. > > While this doesn't fix the underlying issue that the kernel's view of time > is skewed, it allows us to safely migrate guests even from sources that are > known broken. > > Signed-off-by: Alexander Graf <agraf@xxxxxxx> > --- > hw/i386/kvm/clock.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 48 insertions(+) > > diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c > index 892aa02..c6521cf 100644 > --- a/hw/i386/kvm/clock.c > +++ b/hw/i386/kvm/clock.c > @@ -14,6 +14,7 @@ > */ > > #include "qemu-common.h" > +#include "qemu/host-utils.h" > #include "sysemu/sysemu.h" > #include "sysemu/kvm.h" > #include "hw/sysbus.h" > @@ -34,6 +35,47 @@ typedef struct KVMClockState { > bool clock_valid; > } KVMClockState; > > +struct pvclock_vcpu_time_info { > + uint32_t version; > + uint32_t pad0; > + uint64_t tsc_timestamp; > + uint64_t system_time; > + uint32_t tsc_to_system_mul; > + int8_t tsc_shift; > + uint8_t flags; > + uint8_t pad[2]; > +} __attribute__((__packed__)); /* 32 bytes */ > + > +static uint64_t kvmclock_current_nsec(KVMClockState *s) > +{ > + CPUState *cpu = first_cpu; > + CPUX86State *env = cpu->env_ptr; > + hwaddr kvmclock_struct_pa = env->system_time_msr & ~1ULL; > + uint64_t migration_tsc = env->tsc; > + struct pvclock_vcpu_time_info time; > + uint64_t delta; > + uint64_t nsec_lo; > + uint64_t nsec_hi; > + uint64_t nsec; > + > + if (!(env->system_time_msr & 1ULL)) { > + /* KVM clock not active */ > + return 0; > + } > + > + cpu_physical_memory_read(kvmclock_struct_pa, &time, sizeof(time)); > + > + delta = migration_tsc - time.tsc_timestamp; > + if (time.tsc_shift < 0) { > + delta >>= -time.tsc_shift; > + } else { > + delta <<= time.tsc_shift; > + } > + > + mulu64(&nsec_lo, &nsec_hi, delta, time.tsc_to_system_mul); > + nsec = (nsec_lo >> 32) | (nsec_hi << 32); > + return nsec + time.system_time; > +} > > static void kvmclock_vm_state_change(void *opaque, int running, > RunState state) > @@ -45,9 +87,15 @@ static void kvmclock_vm_state_change(void *opaque, int running, > > if (running) { > struct kvm_clock_data data; > + uint64_t time_at_migration = kvmclock_current_nsec(s); > > s->clock_valid = false; > > + if (time_at_migration > s->clock) { > + fprintf(stderr, "KVM Clock migrated backwards, using later time\n"); > + s->clock = time_at_migration; > + } > + > data.clock = s->clock; > data.flags = 0; > ret = kvm_vm_ioctl(kvm_state, KVM_SET_CLOCK, &data); > -- > 1.7.12.4 > > -- > To unsubscribe from this list: send the line "unsubscribe kvm" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html