Andre Przywara wrote:
Avi Kivity wrote:
On 07/27/2010 04:48 PM, Andre Przywara wrote:
Wierd. Maybe the clock goes crazy.
Let's see if it jumps forward alot:
} while (unlikely(last != ret));
+
+ {
+ static u64 last_report;
+ if (ret > last_report + 10000) {
+ last_report = ret;
+ printk("kvmclock: %llx\n", ret);
+ }
+
+ }
return ret;
}
Worth updating the 'return last' to update ret and goto the new code,
so we don't miss that path.
Did that. There is _a lot_ of output (about 350 lines per second via
the 115k serial console), both with smp=1 and smp=2.
The majority is differing about 2,000,000 (ticks?), but a handful of
them are in the range of 20 million.
nanoseconds. So 2-20ms. Consistent with 350 lines/sec.
No difference between smp=2 and smp=1.
I also get some "BUG: recent printk recursion!" and I don't see any
kernel boot progress beyond outputting the BogoMIPS value.
Right, printk() wants the time too.
BTW: I found two message from your earlier debug statement:
[ 0.000000] kvm-clock: cpu 0, msr 0:1ac0401, boot clock
[ 0.000000] kvm-clock: cpu 0, msr 0:1e15401, primary cpu clock
Those are from kvmclock initialization, not from the older patch.
I'm completely confused, everything seems to be in order.
Let's see. if you s/return last/return ret/ in the original, does this
help things along? this makes pvclock drop the computation and should
be exactly the same as before the patch.
Yes, this works, both smp version boot. I see a short very short break
after the line in question, but then it proceeds well.
Thanks for your help, now I got a much better insight into the issue. I
will see if I can find something more.
Did some more investigations, some observations:
- The cmpxchg does not seem to be a problem, I didn't see the loop
iterated more than once.
- Turning off printk-timestamps makes the bug go away. But I guess it is
just hiding or deferring it, and it's no real workaround anyway.
- I instrumented the "if (ret < last) return last;" statement, when the
kernel hangs I get only printks from there, although it has hit before:
----------
[ 0.820000] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[ 0.820000] returning last instead (cnt=19001)
[ 0.820000] returning last instead (cnt=20001)
The last line repeats forever with the same timestamp, the counter
(counting the number of "return last;") increments about 3500 times/second.
I will see if I find something more...
Regards,
Andre.
--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 448-3567-12
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html