On Thu, Feb 25, 2010 at 09:48:47AM +0100, Jan Kiszka wrote: > Marcelo Tosatti wrote: > > On Thu, Feb 25, 2010 at 12:58:26AM +0100, Jan Kiszka wrote: > >> Marcelo Tosatti wrote: > >>> On Thu, Feb 25, 2010 at 12:45:55AM +0100, Jan Kiszka wrote: > >>>> Marcelo Tosatti wrote: > >>>>> On Wed, Feb 24, 2010 at 03:17:55PM +0100, Jan Kiszka wrote: > >>>>>> Drop kvm_load_tsc in favor of level-dependent writeback in > >>>>>> kvm_arch_load_regs. KVM's PV clock MSRs fall in the same category and > >>>>>> should therefore only be written back on full sync. > >>>>>> > >>>>>> Signed-off-by: Jan Kiszka <jan.kiszka@xxxxxxxxxxx> > >>>>>> --- > >>>>>> qemu-kvm-x86.c | 19 +++++-------------- > >>>>>> qemu-kvm.h | 4 ---- > >>>>>> target-i386/machine.c | 5 ----- > >>>>>> 3 files changed, 5 insertions(+), 23 deletions(-) > >>>>>> > >>>>>> diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c > >>>>>> index 840c1c9..84fd7fa 100644 > >>>>>> --- a/qemu-kvm-x86.c > >>>>>> +++ b/qemu-kvm-x86.c > >>>>>> @@ -965,8 +965,11 @@ void kvm_arch_load_regs(CPUState *env, int level) > >>>>>> set_msr_entry(&msrs[n++], MSR_LSTAR , env->lstar); > >>>>>> } > >>>>>> #endif > >>>>>> - set_msr_entry(&msrs[n++], MSR_KVM_SYSTEM_TIME, env->system_time_msr); > >>>>>> - set_msr_entry(&msrs[n++], MSR_KVM_WALL_CLOCK, env->wall_clock_msr); > >>>>>> + if (level == KVM_PUT_FULL_STATE) { > >>>>>> + set_msr_entry(&msrs[n++], MSR_IA32_TSC, env->tsc); > >>>>>> + set_msr_entry(&msrs[n++], MSR_KVM_SYSTEM_TIME, env->system_time_msr); > >>>>>> + set_msr_entry(&msrs[n++], MSR_KVM_WALL_CLOCK, env->wall_clock_msr); > >>>>>> + } > >>>>> As things stand today, the TSC should only be written on migration. See > >>>>> 53f658b3c33616a4997ee254311b335e59063289 in the kernel. > >>>> Migration and power-up - that's what this patch ensures (=> > >>>> KVM_PUT_FULL_STATE). Or where do you see any problem? > >>>> > >>>> Jan > >>>> > >>> The problem is it should not write on power up (the kernel attempts > >>> to synchronize the TSCs in that case, see the commit). > >>> > >> OK, need to read this more carefully. > >> > >> I do not yet understand the difference from user space POV: it tries to > >> transfer the identical TSC values to all currently stopped VCPU threads. > > > > guest tsc = host tsc + offset > > > > So at the time you set_msr(TSC), the guest visible TSC starts ticking. > > For SMP guests, this does not happen exactly at the same time for all > > vcpus. > > Ouch. > > > > >> That should not be different if we are booting a fresh VM or loading a > >> complete state of a migrated image. If it does, it looks like a KVM > >> kernel deficit on first glance. > > > > Yes it is a deficit. After migration TSCs of SMP guests go out of sync. > > Zachary is working on that. > > > > OK, so we need a workaround, ideally without reintroducing hooks. Is > this one acceptable? > > > diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c > index 84fd7fa..285c05a 100644 > --- a/qemu-kvm-x86.c > +++ b/qemu-kvm-x86.c > @@ -966,7 +966,15 @@ void kvm_arch_load_regs(CPUState *env, int level) > } > #endif > if (level == KVM_PUT_FULL_STATE) { > - set_msr_entry(&msrs[n++], MSR_IA32_TSC, env->tsc); > + /* > + * KVM is yet unable to synchronize TSC values of multiple VCPUs on > + * writeback. Until this is fixed, we only write the offset to SMP > + * guests after migration, desynchronizing the VCPUs, but avoiding > + * huge jump-backs that would occur without any writeback at all. > + */ > + if (smp_cpus == 1 || env->tsc != 0) { > + set_msr_entry(&msrs[n++], MSR_IA32_TSC, env->tsc); > + } > set_msr_entry(&msrs[n++], MSR_KVM_SYSTEM_TIME, env->system_time_msr); > set_msr_entry(&msrs[n++], MSR_KVM_WALL_CLOCK, env->wall_clock_msr); > } Well, migration of the SMP TSCs is not precise, but it is better than zeroing the TSCs on migration. So something like a new state (migration only), or a hack that mimicks that behaviour, is needed :( -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html