[PATCHv1] kvm guest: fix uninitialized kvmclock read by KVM guest

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Due to unintialized kvmclock read KVM guest is hanging on SMP boot stage.
If unintialized memory contains fatal garbage then hang reproduction is 100%.
Unintialized memory is allocated by memblock_alloc. So the garbage values
depend on many many things.

See the detailed description of the bug and possible ways to fix it
in the kernel bug tracker.
"Bug 59521 - KVM linux guest reads uninitialized pvclock values before
executing rdmsr MSR_KVM_WALL_CLOCK"

I decided to fix it simply returning 0ULL from kvm_clock_read when
kvm clocksource is not initialized yet.
The same as kernel bootstrap CPU doesn on boot stage when kernel
clocksources are not initialized yet.

Signed-off-by: Eugene Batalov <ebatalov@xxxxxxxxxxxxx>
---
I dont' use kernel percpu variables because for each SMP CPU
their contents are copied from the bootstrap CPU. And I don't
think that fixing the value for each SMP CPU is a good style.
If you know a better approach to store the is_pv_clock_ready flags
I am ready to use it.

The patch applies cleanly to
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

I've tested the changes with Ubuntu 13.04 "raring" userspace and
Ubuntu-3.8.0.19-30 kernel tag.

 arch/x86/kernel/kvmclock.c |   13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index 5bedbdd..a6e0af4 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -43,6 +43,9 @@ early_param("no-kvmclock", parse_no_kvmclock);
 static struct pvclock_vsyscall_time_info *hv_clock;
 static struct pvclock_wall_clock wall_clock;
 
+/* For each cpu store here a flag which tells whether pvclock is initialized */
+static int __cacheline_aligned_in_smp is_pv_clock_ready[NR_CPUS] = {};
+
 /*
  * The wallclock is the time of day when we booted. Since then, some time may
  * have elapsed since the hypervisor wrote the data. So we try to account for
@@ -84,8 +87,11 @@ static cycle_t kvm_clock_read(void)
 
 	preempt_disable_notrace();
 	cpu = smp_processor_id();
-	src = &hv_clock[cpu].pvti;
-	ret = pvclock_clocksource_read(src);
+	if (is_pv_clock_ready[cpu]) {
+		src = &hv_clock[cpu].pvti;
+		ret = pvclock_clocksource_read(src);
+	} else
+		ret = 0ULL;
 	preempt_enable_notrace();
 	return ret;
 }
@@ -168,6 +174,9 @@ int kvm_register_clock(char *txt)
 	printk(KERN_INFO "kvm-clock: cpu %d, msr %x:%x, %s\n",
 	       cpu, high, low, txt);
 
+	if (!ret)
+		is_pv_clock_ready[cpu] = 1;
+
 	return ret;
 }
 
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux