Re: 2.6.35-rc1 regression with pvclock and smp guests

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 07/30/2010 02:34 PM, Arjan Koers wrote:
On 2010-07-28 12:37, Avi Kivity wrote:
  On 07/28/2010 12:00 AM, Arjan Koers wrote:
On 2010-07-26 20:59, Arjan Koers wrote:

I ran into the same problem. 2.6.34.1 and 2.6.35-rc6 SMP guest
kernels hang during boot.
It appears that last is way ahead of ret twice.
The kernel boots with this debug patch that makes the clock go
backwards if the difference is big:

       last = atomic64_read(&last_value);
       do {
-        if (ret<   last)
-            return last;
+        if (ret<   last) {
+            if ( last - ret<   25000000 )
+                return last;
+            else
+                printk("pvclock backwards: ret = %llx; last =
%llx\n", ret, last);
+        }
           last = atomic64_cmpxchg(&last_value, last, ret);
       } while (unlikely(last != ret));



[    0.037122] Total of 2 processors activated (11198.08 BogoMIPS).
[    0.037118] x86 PAT enabled: cpu 1, old 0x0, new 0x7010600070106
[    0.040000] pvclock backwards: ret = 108373705fe2; last = 210aff61470a
Zaaaacchhhh?!


The lists below show some debug data of the first 99 calls to
pvclock_clocksource_read since the kernel booted. The situation
after the 'do ... while (version != src->version)' loop is
displayed.

Meaning of the columns:
- src pointer
- shadow.tsc_timestamp
- shadow.system_timestamp
- shadow.version
- native_read_tsc()
- delta = native_read_tsc() - shadow.tsc_timestamp
- offset = scale_delta(delta, shadow.tsc_to_nsec_mul, shadow.tsc_shift)
- ret = shadow.system_timestamp + offset

Fields left out, because they were the same for all rows:
- shadow.tsc_to_nsec_mul: b6dc43b6
- shadow.tsc_shift: ffffffff
- shadow.flags: 0

Debug log of guest after cold boot of virtual machine:
  1: ffff880001411c00     2107d5a4e b42c01d704c6   8294     210d8d4b5        5b7a67        20abdc  b42c01f7b0a2
  2: ffff880001411c00     2107d5a4e b42c01d704c6   8294     210dc2b61        5ed113        21dd1b  b42c01f8e1e1
  3: ffff880001411c00     21cb0d4a8 b42c0632768f   bb70     21cb10a00          3558          130d  b42c0632899c
  4: ffff880001411c00     21cb0d4a8 b42c0632768f   bb70     21cb11f17          4a6f          1a95  b42c06329124
  5: ffff880001411c00     21cceaad2 b42c063d1e45   bbd8     21ccec522          1a50           965  b42c063d27aa
  6: ffff880001411c00     21cde0644 b42c06429a42   bc10     21ce25457         44e13         1899a  b42c064423dc
  7: ffff880001411c00     21cf905c1 b42c064c3e76   bc46     21cfa182b         1126a          6201  b42c064ca077
  8: ffff880001411c00     21d088194 b42c0651c601   bc7a     21d089592          13fe           723  b42c0651cd24
  9: ffff880001411c00     21d1ad073 b42c06584fc3   bcde     21d1b135d          42ea          17e5  b42c065867a8
10: ffff880001411c00     21d2a3837 b42c065dd039   bd10     21d2a4825           fee           5b0  b42c065dd5e9
11: ffff880001411c00     21d38bab3 b42c0662fea6   bd42     21d38caa1           fee           5b0  b42c06630456
12: ffff880001411c00     21d47459b b42c06683029   bd78     21d475517           f7c           587  b42c066835b0
13: ffff880001411c00     21d578ce7 b42c066e005f   bdb2     21d57d70c          4a25          1a7a  b42c066e1ad9
14: ffff880001411c00     21d578ce7 b42c066e005f   bdb2     21d57d8d6          4bef          1b1e  b42c066e1b7d
15: ffff880001411c00     21d578ce7 b42c066e005f   bdb2     21d57da22          4d3b          1b94  b42c066e1bf3
16: ffff880001411c00     21d578ce7 b42c066e005f   bdb2     21d57fc5e          6f77          27ce  b42c066e282d
17: ffff880001411c00     21d67c77c b42c0673cc0a   bde4     21d67d685           f09           55e  b42c0673d168
18: ffff880001411c00     21d7625b2 b42c0678ed96   be16     21d763488           ed6           54c  b42c0678f2e2
19: ffff880001411c00     21df3db36 b42c06a5d222   be54     21dfa78b9         69d83         25cd5  b42c06a82ef7
20: ffff880001411c00     21df3db36 b42c06a5d222   be54     21dfa7a3f         69f09         25d61  b42c06a82f83
21: ffff880001411c00     21df3db36 b42c06a5d222   be54     21dfa7f8b         6a455         25f45  b42c06a83167
22: ffff880001411c00     21e3a50ea b42c06befbb1   be58     21e3c1750         1c666          a249  b42c06bf9dfa
23: ffff880001411c00     21e4bfe47 b42c06c54bc5   be92     21e4c4c61          4e1a          1be4  b42c06c567a9
24: ffff880001411c00     21ea2a56e b42c06e43dbc   beca     21ea4b224         20cb6          bb66  b42c06e4f922
25: ffff880001411c00     21ea2a56e b42c06e43dbc   beca     21ea52748         281da          e53c  b42c06e522f8
26: ffff880001411c00     21ea2a56e b42c06e43dbc   beca     21ea52907         28399          e5db  b42c06e52397
27: ffff880001411c00     21ea2a56e b42c06e43dbc   beca     21ea52a76         28508          e65f  b42c06e5241b
28: ffff880001411c00     21ea2a56e b42c06e43dbc   beca     21ea5c86a         322fc         11ec9  b42c06e55c85
29: ffff880001411c00     21ea2a56e b42c06e43dbc   beca     21ea60e3a         368cc         137b7  b42c06e57573
30: ffff880001411c00     21ea2a56e b42c06e43dbc   beca     21ea64dc8         3a85a         14e6a  b42c06e58c26
31: ffff880001411c00     21ed8a003 b42c06f78496   bf02     21efda28b        250288         d37d2  b42c0704bc68
32: ffff880001411c00     21f0e9488 b42c070ac93f   bf38     21f0eacdb          1853           8af  b42c070ad1ee
33: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230aeeac6          3e60          1646  b42c0d5636ed
34: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230af06d0          5a6a          204a  b42c0d5640f1
35: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230b03f25         192bf          8fd6  b42c0d56b07d
36: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230b043c8         19762          917f  b42c0d56b226
37: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230b0526b         1a605          96b8  b42c0d56b75f
38: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230b05632         1a9cc          9812  b42c0d56b8b9
39: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230b07eaa         1d244          a686  b42c0d56c72d
40: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230b094e9         1e883          ae78  b42c0d56cf1f
41: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230b09962         1ecfc          b011  b42c0d56d0b8
42: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230b10590         2592a          d6b4  b42c0d56f75b
43: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230b1090d         25ca7          d7f3  b42c0d56f89a
44: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230b10f99         26333          da49  b42c0d56faf0
45: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230b11204         2659e          db27  b42c0d56fbce
46: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230b1217c         27516          e0ad  b42c0d570154
47: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230b1483f         29bd9          ee85  b42c0d570f2c
48: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230b14ba6         29f40          efbc  b42c0d571063
49: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230b15569         2a903          f338  b42c0d5713df
50: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230b250b3         3a44d         14cf8  b42c0d576d9f
51: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230b254a0         3a83a         14e5f  b42c0d576f06
52: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230b25bd8         3af72         150f3  b42c0d57719a
53: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230b25ec3         3b25d         151fd  b42c0d5772a4
54: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230b5fcab         75045         29cad  b42c0d58bd54
55: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230b6013b         754d5         29e4e  b42c0d58bef5
56: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230b6b86c         80c06         2dfbc  b42c0d590063
57: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230b6bc41         80fdb         2e11a  b42c0d5901c1
58: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230b6c4e5         8187f         2e430  b42c0d5904d7
59: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230b6c776         81b10         2e51b  b42c0d5905c2
60: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230b7f97b         94d15         35266  b42c0d59730d
61: ffff880001411c00     230aeac66 b42c0d5620a7   c100     230b864af         9b849         378b0  b42c0d599957
62: ffff880001411c00     23132e49d b42c0d855884   c16e     231599c3a        26b79d         dd3ec  b42c0d932c70
63: ffff880001411c00     23132e49d b42c0d855884   c16e     231599dbc        26b91f         dd476  b42c0d932cfa
64: ffff880001411c00     23132e49d b42c0d855884   c16e     231599f5f        26bac2         dd50c  b42c0d932d90
65: ffff880001411c00     231fdf357 b42c0dcddc47   c176     232046a74         6771d         24f1e  b42c0dd02b65
66: ffff880001411c00     231fdf357 b42c0dcddc47   c176     232046c53         678fc         24fca  b42c0dd02c11
67: ffff880001411c00     231fdf357 b42c0dcddc47   c176     232046da0         67a49         25040  b42c0dd02c87
68: ffff880001411c00     232f4a54e b42c0e25f5e7   c17c     232f62a2d         184df          8ae2  b42c0e2680c9
69: ffff880001411c00     232f4a54e b42c0e25f5e7   c17c     232f63478         18f2a          8e8f  b42c0e268476
70: ffff880001411c00     232f4a54e b42c0e25f5e7   c17c     232f63f61         19a13          9274  b42c0e26885b
71: ffff880001511c00     20afec946 b42bffe0b604    130 1f890681eacdf 1f88e5d1fe399  b433ab005565 1685faae10b69

Okay, I think I know what's going on and why Glauber's patch causes problems for you. It looks like your kernel is reading the kvmclock on the AP before it is initialized. Looking at the guest side of things, it seems entirely plausible this could happen.

You did mention printk timing causes the bug to appear? Perhaps it is not just coincidental. Printk getting the time might very well call back into the timer code before the clock is initialized, and you've got tons of stuff in cpu_init and friends that are likely to want to printk all kinds of bootup messages.

If this were in fact the case, the cmpxchg that was added by Glauber's patch could leap your clock forward to some very uninitialized random value and then you could end up stuck in a timeout loop for days, as you are seeing.

Can you try very simply disabling printk timing to see if that might be the source of the bug? In the meantime, what kernel version do you have in the guests?

Zach
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux