On Tue, May 04, 2010 at 02:35:28PM -0400, Glauber Costa wrote: > Currently, in the linux kernel, we reset kvmclock if we are rebooting > into a crash kernel through kexec. The rationale, is that a new kernel > won't follow the same memory addresses, and the memory where kvmclock is > located in the first kernel, will be something else in the second one. > > We don't do it in normal reboots, because the second kernel ends up > registering kvmclock again, which has the effect of turning off the > first instance. > > This is, however, totally wrong. This assumes we're booting into > a kernel that also has kvmclock enabled. If by some reason we reboot > into something that doesn't do kvmclock including but not limited to: > * rebooting into an older kernel without kvmclock support, > * rebooting with no-kvmclock, > * rebootint into another O.S, > > we'll simply have the hypervisor writing into a random memory position > into the guest. Neat, uh? > > Moreover, I believe the fix belongs in qemu, since it is the entity > more prepared to detect all kinds of reboots (by means of a cpu_reset), > not to mention the presence of misbehaving guests, that can forget > to turn kvmclock off. > > This patch fixes the issue for me. > > Signed-off-by: Glauber Costa <glommer@xxxxxxxxxx> > --- > qemu-kvm-x86.c | 19 +++++++++++++++++++ > 1 files changed, 19 insertions(+), 0 deletions(-) > > diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c > index 439c31a..4b94e04 100644 > --- a/qemu-kvm-x86.c > +++ b/qemu-kvm-x86.c > @@ -1417,8 +1417,27 @@ void kvm_arch_push_nmi(void *opaque) > } > #endif /* KVM_CAP_USER_NMI */ > > +static int kvm_turn_off_clock(CPUState *env) > +{ > + struct { > + struct kvm_msrs info; > + struct kvm_msr_entry entries[100]; > + } msr_data; > + > + struct kvm_msr_entry *msrs = msr_data.entries; > + int n = 0; > + > + kvm_msr_entry_set(&msrs[n++], MSR_KVM_SYSTEM_TIME, 0); > + kvm_msr_entry_set(&msrs[n++], MSR_KVM_WALL_CLOCK, 0); > + msr_data.info.nmsrs = n; > + > + return kvm_vcpu_ioctl(env, KVM_SET_MSRS, &msr_data); > +} > + > + > void kvm_arch_cpu_reset(CPUState *env) > { > + kvm_turn_off_clock(env); Better to zero env->system_time_msr and wall_clock_msr here, and modify kvm_arch_load_regs to write those MSRs on KVM_PUT_RESET_STATE (to be conformant with the new writeback scheme). -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html