On 07/30/2010 02:34 PM, Arjan Koers wrote:
On 2010-07-28 12:37, Avi Kivity wrote:
On 07/28/2010 12:00 AM, Arjan Koers wrote:
On 2010-07-26 20:59, Arjan Koers wrote:
I ran into the same problem. 2.6.34.1 and 2.6.35-rc6 SMP guest
kernels hang during boot.
It appears that last is way ahead of ret twice.
The kernel boots with this debug patch that makes the clock go
backwards if the difference is big:
last = atomic64_read(&last_value);
do {
- if (ret< last)
- return last;
+ if (ret< last) {
+ if ( last - ret< 25000000 )
+ return last;
+ else
+ printk("pvclock backwards: ret = %llx; last =
%llx\n", ret, last);
+ }
last = atomic64_cmpxchg(&last_value, last, ret);
} while (unlikely(last != ret));
[ 0.037122] Total of 2 processors activated (11198.08 BogoMIPS).
[ 0.037118] x86 PAT enabled: cpu 1, old 0x0, new 0x7010600070106
[ 0.040000] pvclock backwards: ret = 108373705fe2; last = 210aff61470a
Zaaaacchhhh?!
The lists below show some debug data of the first 99 calls to
pvclock_clocksource_read since the kernel booted. The situation
after the 'do ... while (version != src->version)' loop is
displayed.
Meaning of the columns:
- src pointer
- shadow.tsc_timestamp
- shadow.system_timestamp
- shadow.version
- native_read_tsc()
- delta = native_read_tsc() - shadow.tsc_timestamp
- offset = scale_delta(delta, shadow.tsc_to_nsec_mul, shadow.tsc_shift)
- ret = shadow.system_timestamp + offset
Fields left out, because they were the same for all rows:
- shadow.tsc_to_nsec_mul: b6dc43b6
- shadow.tsc_shift: ffffffff
- shadow.flags: 0
Debug log of guest after cold boot of virtual machine:
1: ffff880001411c00 2107d5a4e b42c01d704c6 8294 210d8d4b5 5b7a67 20abdc b42c01f7b0a2
2: ffff880001411c00 2107d5a4e b42c01d704c6 8294 210dc2b61 5ed113 21dd1b b42c01f8e1e1
3: ffff880001411c00 21cb0d4a8 b42c0632768f bb70 21cb10a00 3558 130d b42c0632899c
4: ffff880001411c00 21cb0d4a8 b42c0632768f bb70 21cb11f17 4a6f 1a95 b42c06329124
5: ffff880001411c00 21cceaad2 b42c063d1e45 bbd8 21ccec522 1a50 965 b42c063d27aa
6: ffff880001411c00 21cde0644 b42c06429a42 bc10 21ce25457 44e13 1899a b42c064423dc
7: ffff880001411c00 21cf905c1 b42c064c3e76 bc46 21cfa182b 1126a 6201 b42c064ca077
8: ffff880001411c00 21d088194 b42c0651c601 bc7a 21d089592 13fe 723 b42c0651cd24
9: ffff880001411c00 21d1ad073 b42c06584fc3 bcde 21d1b135d 42ea 17e5 b42c065867a8
10: ffff880001411c00 21d2a3837 b42c065dd039 bd10 21d2a4825 fee 5b0 b42c065dd5e9
11: ffff880001411c00 21d38bab3 b42c0662fea6 bd42 21d38caa1 fee 5b0 b42c06630456
12: ffff880001411c00 21d47459b b42c06683029 bd78 21d475517 f7c 587 b42c066835b0
13: ffff880001411c00 21d578ce7 b42c066e005f bdb2 21d57d70c 4a25 1a7a b42c066e1ad9
14: ffff880001411c00 21d578ce7 b42c066e005f bdb2 21d57d8d6 4bef 1b1e b42c066e1b7d
15: ffff880001411c00 21d578ce7 b42c066e005f bdb2 21d57da22 4d3b 1b94 b42c066e1bf3
16: ffff880001411c00 21d578ce7 b42c066e005f bdb2 21d57fc5e 6f77 27ce b42c066e282d
17: ffff880001411c00 21d67c77c b42c0673cc0a bde4 21d67d685 f09 55e b42c0673d168
18: ffff880001411c00 21d7625b2 b42c0678ed96 be16 21d763488 ed6 54c b42c0678f2e2
19: ffff880001411c00 21df3db36 b42c06a5d222 be54 21dfa78b9 69d83 25cd5 b42c06a82ef7
20: ffff880001411c00 21df3db36 b42c06a5d222 be54 21dfa7a3f 69f09 25d61 b42c06a82f83
21: ffff880001411c00 21df3db36 b42c06a5d222 be54 21dfa7f8b 6a455 25f45 b42c06a83167
22: ffff880001411c00 21e3a50ea b42c06befbb1 be58 21e3c1750 1c666 a249 b42c06bf9dfa
23: ffff880001411c00 21e4bfe47 b42c06c54bc5 be92 21e4c4c61 4e1a 1be4 b42c06c567a9
24: ffff880001411c00 21ea2a56e b42c06e43dbc beca 21ea4b224 20cb6 bb66 b42c06e4f922
25: ffff880001411c00 21ea2a56e b42c06e43dbc beca 21ea52748 281da e53c b42c06e522f8
26: ffff880001411c00 21ea2a56e b42c06e43dbc beca 21ea52907 28399 e5db b42c06e52397
27: ffff880001411c00 21ea2a56e b42c06e43dbc beca 21ea52a76 28508 e65f b42c06e5241b
28: ffff880001411c00 21ea2a56e b42c06e43dbc beca 21ea5c86a 322fc 11ec9 b42c06e55c85
29: ffff880001411c00 21ea2a56e b42c06e43dbc beca 21ea60e3a 368cc 137b7 b42c06e57573
30: ffff880001411c00 21ea2a56e b42c06e43dbc beca 21ea64dc8 3a85a 14e6a b42c06e58c26
31: ffff880001411c00 21ed8a003 b42c06f78496 bf02 21efda28b 250288 d37d2 b42c0704bc68
32: ffff880001411c00 21f0e9488 b42c070ac93f bf38 21f0eacdb 1853 8af b42c070ad1ee
33: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230aeeac6 3e60 1646 b42c0d5636ed
34: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230af06d0 5a6a 204a b42c0d5640f1
35: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b03f25 192bf 8fd6 b42c0d56b07d
36: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b043c8 19762 917f b42c0d56b226
37: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b0526b 1a605 96b8 b42c0d56b75f
38: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b05632 1a9cc 9812 b42c0d56b8b9
39: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b07eaa 1d244 a686 b42c0d56c72d
40: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b094e9 1e883 ae78 b42c0d56cf1f
41: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b09962 1ecfc b011 b42c0d56d0b8
42: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b10590 2592a d6b4 b42c0d56f75b
43: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b1090d 25ca7 d7f3 b42c0d56f89a
44: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b10f99 26333 da49 b42c0d56faf0
45: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b11204 2659e db27 b42c0d56fbce
46: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b1217c 27516 e0ad b42c0d570154
47: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b1483f 29bd9 ee85 b42c0d570f2c
48: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b14ba6 29f40 efbc b42c0d571063
49: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b15569 2a903 f338 b42c0d5713df
50: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b250b3 3a44d 14cf8 b42c0d576d9f
51: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b254a0 3a83a 14e5f b42c0d576f06
52: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b25bd8 3af72 150f3 b42c0d57719a
53: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b25ec3 3b25d 151fd b42c0d5772a4
54: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b5fcab 75045 29cad b42c0d58bd54
55: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b6013b 754d5 29e4e b42c0d58bef5
56: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b6b86c 80c06 2dfbc b42c0d590063
57: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b6bc41 80fdb 2e11a b42c0d5901c1
58: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b6c4e5 8187f 2e430 b42c0d5904d7
59: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b6c776 81b10 2e51b b42c0d5905c2
60: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b7f97b 94d15 35266 b42c0d59730d
61: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b864af 9b849 378b0 b42c0d599957
62: ffff880001411c00 23132e49d b42c0d855884 c16e 231599c3a 26b79d dd3ec b42c0d932c70
63: ffff880001411c00 23132e49d b42c0d855884 c16e 231599dbc 26b91f dd476 b42c0d932cfa
64: ffff880001411c00 23132e49d b42c0d855884 c16e 231599f5f 26bac2 dd50c b42c0d932d90
65: ffff880001411c00 231fdf357 b42c0dcddc47 c176 232046a74 6771d 24f1e b42c0dd02b65
66: ffff880001411c00 231fdf357 b42c0dcddc47 c176 232046c53 678fc 24fca b42c0dd02c11
67: ffff880001411c00 231fdf357 b42c0dcddc47 c176 232046da0 67a49 25040 b42c0dd02c87
68: ffff880001411c00 232f4a54e b42c0e25f5e7 c17c 232f62a2d 184df 8ae2 b42c0e2680c9
69: ffff880001411c00 232f4a54e b42c0e25f5e7 c17c 232f63478 18f2a 8e8f b42c0e268476
70: ffff880001411c00 232f4a54e b42c0e25f5e7 c17c 232f63f61 19a13 9274 b42c0e26885b
71: ffff880001511c00 20afec946 b42bffe0b604 130 1f890681eacdf 1f88e5d1fe399 b433ab005565 1685faae10b69
Okay, I think I know what's going on and why Glauber's patch causes
problems for you. It looks like your kernel is reading the kvmclock on
the AP before it is initialized. Looking at the guest side of things,
it seems entirely plausible this could happen.
You did mention printk timing causes the bug to appear? Perhaps it is
not just coincidental. Printk getting the time might very well call
back into the timer code before the clock is initialized, and you've got
tons of stuff in cpu_init and friends that are likely to want to printk
all kinds of bootup messages.
If this were in fact the case, the cmpxchg that was added by Glauber's
patch could leap your clock forward to some very uninitialized random
value and then you could end up stuck in a timeout loop for days, as you
are seeing.
Can you try very simply disabling printk timing to see if that might be
the source of the bug? In the meantime, what kernel version do you have
in the guests?
Zach
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html