Zachary Amsden wrote:
On 07/28/2010 02:25 AM, Andre Przywara wrote:
Andre Przywara wrote:
Andre Przywara wrote:
Avi Kivity wrote:
On 07/27/2010 04:48 PM, Andre Przywara wrote:
Weird. Maybe the clock goes crazy.
Let's see if it jumps forward a lot:
	} while (unlikely(last != ret));
+
+	{
+		static u64 last_report;
+
+		if (ret > last_report + 10000) {
+			last_report = ret;
+			printk("kvmclock: %llx\n", ret);
+		}
+	}
	return ret;
}
Worth updating the 'return last' to update ret and goto the new
code, so we don't miss that path.
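Something like this, i.e. a rough sketch of how I'd rearrange the loop (written from memory, not a verbatim hunk):

	last = atomic64_read(&last_value);
	do {
		if (ret < last) {
			/* don't return early; report this path as well */
			ret = last;
			goto report;
		}
		last = atomic64_cmpxchg(&last_value, last, ret);
	} while (unlikely(last != ret));

report:
	{
		static u64 last_report;

		if (ret > last_report + 10000) {
			last_report = ret;
			printk("kvmclock: %llx\n", ret);
		}
	}
	return ret;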
Did that. There is _a lot_ of output (about 350 lines per second
via the 115k serial console), both with smp=1 and smp=2.
The majority differ by about 2,000,000 (ticks?), but a handful
are in the range of 20 million.
nanoseconds. So 2-20ms. Consistent with 350 lines/sec.
No difference between smp=2 and smp=1.
I also get some "BUG: recent printk recursion!" messages, and I don't
see any kernel boot progress beyond the BogoMIPS value.
Right, printk() wants the time too.
BTW: I found two messages from your earlier debug statement:
[ 0.000000] kvm-clock: cpu 0, msr 0:1ac0401, boot clock
[ 0.000000] kvm-clock: cpu 0, msr 0:1e15401, primary cpu clock
Those are from kvmclock initialization, not from the older patch.
I'm completely confused, everything seems to be in order.
Let's see. If you s/return last/return ret/ in the original, does
this help things along? This makes pvclock drop the monotonicity
clamping and should be exactly the same as before the patch.
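For reference, the hunk I mean is roughly this (paraphrased from memory, not a verbatim quote of the patch):

	/* monotonicity check added by the patch in question */
	last = atomic64_read(&last_value);
	do {
		if (ret < last)
			return last;	/* <- change this to "return ret;" */
		last = atomic64_cmpxchg(&last_value, last, ret);
	} while (unlikely(last != ret));

	return ret;

With "return ret" the locally computed time is always returned, so the clamping to the globally last-seen value is effectively disabled.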
Yes, this works; both SMP versions boot. I see a very short
pause after the line in question, but then it proceeds fine.
Thanks for your help, I now have much better insight into the
issue. I will see if I can find out more.
Did some more investigation; some observations:
- The cmpxchg does not seem to be a problem; I didn't see the loop
iterate more than once.
- Turning off printk timestamps makes the bug go away. But I guess that
just hides or defers it, and it's no real workaround anyway.
- I instrumented the "if (ret < last) return last;" statement (a rough
sketch follows the log below); when the kernel hangs I only get printks
from there, although it had hit before:
----------
[ 0.820000] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[ 0.820000] returning last instead (cnt=19001)
[ 0.820000] returning last instead (cnt=20001)
The last line repeats forever with the same timestamp; the counter
(counting the number of "return last;" hits) increments about 3,500
times per second.
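(The instrumentation itself is trivial, roughly the following; sketching it here from memory, the exact printk format varied between runs:)

	if (ret < last) {
		/* debug only: count how often we fall back to last */
		static unsigned long cnt;

		cnt++;
		printk("returning last instead (cnt=%lu)\n", cnt);
		return last;
	}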
I will see if I find something more...
Added some more instrumentation; it seems like the value read from
pvclock is bogus *sometimes*:
returning last instead (2778021535795841, cnt=1, diff=1389078312510470)
This is from the first time the if-statement triggers. So I guess the
value read is ridiculously far in the future (multiple days), so all
subsequent calls to clocksource_read() will return this bogus last value.
This means that the clock does not make progress (for several days)
and thus timing loops will never come to an end. I also instrumented
the serial driver; the last thing I saw was autoconfig_irq, which
obviously calls udelay().
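(To put a number on "multiple days": the diff from the printk above is about 1.39e6 seconds, i.e. roughly 16 days. A throwaway user-space check, just to get the magnitude:)

/* not part of any patch, just converting the printed diff to days */
#include <stdio.h>

int main(void)
{
	unsigned long long diff_ns = 1389078312510470ULL;	/* diff= from the printk above */

	printf("%.1f days\n", diff_ns / 1e9 / 86400.0);		/* about 16.1 days */
	return 0;
}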
Does that ring a bell with someone?
I will now concentrate on the pvclock readout/HV write part to see
which of the values used here are wrong.
Have you gotten any further results on this?
Somewhat. I think my latest findings were more or less ghost bugs: since
printks contain a timestamp, they interfere with the code under
observation. The large gap I described above only showed up with these
printks; it is more or less double the real value (which is my host's
uptime).
Sadly I cannot use debugfs to avoid the printks, since the kernel halts
and I never get to userland.
On another try I managed to bisect the failure in qemu-kvm as well. The
bug only triggers with "ebc4f45 turn off kvmclock when resetting cpu"
applied (_in addition_ to the kernel patch in question).
When I comment out the call to kvm_reset_msrs() on the master branch,
the bug also vanishes.
I think the most likely explanation is that your host's TSCs are out of
sync across CPUs, and somehow this leaks over into pvclock. Am I correct
that it happens even with one guest VCPU? What if you disable secondary
host CPUs?
I tried several ways to pin the VCPUs to host CPUs (cores and
sockets): both VCPUs on one core, both on different cores of the
same socket, and both on different sockets/nodes. None of that made
any difference; the kernel halted in every case.
I also tried booting the host with maxcpus=1, but the error was still
the same: -smp 1 works, -smp 2 halts.
Btw.: the host uses the acpi_pm clocksource. I also noticed that
sometimes the guest gets very slow after switching the clocksource to
kvmclock; it then eventually halts at the mentioned line.
Regards,
Andre.
--
Andre Przywara
AMD-OSRC (Dresden)
Tel: x29712