On 2010-07-31 04:39, Zachary Amsden wrote: > On 07/30/2010 02:34 PM, Arjan Koers wrote: >> On 2010-07-28 12:37, Avi Kivity wrote: >> >>> On 07/28/2010 12:00 AM, Arjan Koers wrote: >>> >>>> On 2010-07-26 20:59, Arjan Koers wrote: >>>> >>>> >>>>> I ran into the same problem. 2.6.34.1 and 2.6.35-rc6 SMP guest >>>>> kernels hang during boot. >>>>> >>>> It appears that last is way ahead of ret twice. >>>> The kernel boots with this debug patch that makes the clock go >>>> backwards if the difference is big: >>>> >>>> last = atomic64_read(&last_value); >>>> do { >>>> - if (ret< last) >>>> - return last; >>>> + if (ret< last) { >>>> + if ( last - ret< 25000000 ) >>>> + return last; >>>> + else >>>> + printk("pvclock backwards: ret = %llx; last = >>>> %llx\n", ret, last); >>>> + } >>>> last = atomic64_cmpxchg(&last_value, last, ret); >>>> } while (unlikely(last != ret)); >>>> >>>> >>>> >>>> [ 0.037122] Total of 2 processors activated (11198.08 BogoMIPS). >>>> [ 0.037118] x86 PAT enabled: cpu 1, old 0x0, new 0x7010600070106 >>>> [ 0.040000] pvclock backwards: ret = 108373705fe2; last = >>>> 210aff61470a >>>> >>> Zaaaacchhhh?! >>> >>> >> >> The lists below show some debug data of the first 99 calls to >> pvclock_clocksource_read since the kernel booted. The situation >> after the 'do ... while (version != src->version)' loop is >> displayed. >> >> Meaning of the columns: >> - src pointer >> - shadow.tsc_timestamp >> - shadow.system_timestamp >> - shadow.version >> - native_read_tsc() >> - delta = native_read_tsc() - shadow.tsc_timestamp >> - offset = scale_delta(delta, shadow.tsc_to_nsec_mul, shadow.tsc_shift) >> - ret = shadow.system_timestamp + offset >> >> Fields left out, because they were the same for all rows: >> - shadow.tsc_to_nsec_mul: b6dc43b6 >> - shadow.tsc_shift: ffffffff >> - shadow.flags: 0 >> >> Debug log of guest after cold boot of virtual machine: <snip> >> 70: ffff880001411c00 232f4a54e b42c0e25f5e7 c17c >> 232f63f61 19a13 9274 b42c0e26885b >> 71: ffff880001511c00 20afec946 b42bffe0b604 130 1f890681eacdf >> 1f88e5d1fe399 b433ab005565 1685faae10b69 >> > > Okay, I think I know what's going on and why Glauber's patch causes > problems for you. It looks like your kernel is reading the kvmclock on > the AP before it is initialized. Looking at the guest side of things, > it seems entirely plausible this could happen. > > You did mention printk timing causes the bug to appear? Perhaps it is > not just coincidental. Printk getting the time might very well call > back into the timer code before the clock is initialized, and you've got > tons of stuff in cpu_init and friends that are likely to want to printk > all kinds of bootup messages. > > If this were in fact the case, the cmpxchg that was added by Glauber's > patch could leap your clock forward to some very uninitialized random > value and then you could end up stuck in a timeout loop for days, as you > are seeing. Yes. That large wrong value is stored in last_value and all future correct values are ignored, because they are smaller then last_value. > Can you try very simply disabling printk timing to see if that might be > the source of the bug? In the meantime, what kernel version do you have > in the guests? The kernel boots successfully when CONFIG_PRINTK_TIME is not set. I'm testing with 2.6.35-rc6 now. The problem also occurs with 2.6.34.1, which also has Glauber's patch. Version 2.6.34 is working. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html