On Thu, 8 Oct 2009, Michael Tokarev wrote:
> Thomas Gleixner wrote:
> >
> > I'm really missing the big picture here.
> > What means "causes timers to be calculated on the "wrong" CPU etc" ?
> > And what do you consider a "scheduling mistake" ?
>
> From the initial diagnostics by Marcelo:
>
> > It seems the way hrtimer_interrupt_hanging calculates min_delta is
> > wrong (especially to virtual machines). The guest vcpu can be scheduled
> > out during the execution of the hrtimer callbacks (and the callbacks
> > themselves can do operations that translate to blocking operations in
> > the hypervisor).
> >
> > So high min_delta values can be calculated if, for example, a single
> > hrtimer_interrupt run takes two host time slices to execute, while some
> > other higher priority task runs for N slices in between.
>
> From this I conclude that the huge min_delta is due to some other task(s)
> on the host being run while this guest is in hrtimer callback. But I
> fail to see why that process on the host takes SO MUCH time, to warrant
> resulting min_delta to 0.5s, or to cause delays for 3..5 seconds in
> guest. It's ok to have delays in range of several extra milliseconds,
> but for *seconds* is too much.
>
> Note again that neither host nor guest is under high load when
> this jump happens. Also note that there's no high-priority processes
> running on the host, all are of the same priority level, including
> all the guests.
>
> Note also that so far I only see it on SMP guests, never on UP
> guests. And only on guests with kvm_clock, not with acpi_pm
> clocksource.
>
> What I'm trying to say is that it looks like there's something
> else wrong here in the guest code. Huge stalls, huge delays
> while in hrtimer callback (I think it happens always when such
> delay is happening, it's just noticed by hrtimer code) -- that's
> the root cause of all this, (probably) wrong logic in hrtimer
> calibration just shows the results of something that's wrong
> elsewhere.

Ah, ok. That makes sense. The hrtimer interrupt hang check detects
that the CPU was "stolen" for whatever reasons.

I'm wondering why this happens several times in a row - it takes at
least 4 iterations until it decides to make the interval larger.

Thanks,

	tglx
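
To make the arithmetic in this exchange concrete, here is a toy model of a hang check that only fires after several consecutive failed passes and then derives min_delta from the observed wall-clock time. It is a sketch under stated assumptions, not the kernel code: the function name simulate_hang_check, the retry limit of 4 (taken from the "at least 4 iterations" remark), the multiplier of 3, and all timing values are illustrative. The point is only that the guest cannot separate its own callback time from time during which the host had descheduled the vcpu.

```python
# Toy model of an hrtimer "interrupt hang" check. NOT kernel code:
# the names, RETRY_LIMIT and MIN_DELTA_FACTOR are assumptions made
# for illustration only.

RETRY_LIMIT = 4        # assumed: consecutive failed passes before the check fires
MIN_DELTA_FACTOR = 3   # assumed: safety multiplier applied to the observed time


def simulate_hang_check(pass_costs_ns, next_timer_gap_ns,
                        retry_limit=RETRY_LIMIT, factor=MIN_DELTA_FACTOR):
    """Return the new min_delta in ns, or None if the check does not fire.

    pass_costs_ns     -- observed wall-clock cost of each pass through the
                         expired-timer loop (guest work plus any stolen time)
    next_timer_gap_ns -- distance from the start of a pass to the next
                         pending expiry
    """
    elapsed_ns = 0
    retries = 0
    for cost_ns in pass_costs_ns:
        elapsed_ns += cost_ns
        if cost_ns < next_timer_gap_ns:
            # The next expiry is still in the future, so reprogramming
            # the event device succeeds and the interrupt returns.
            return None
        # The next timer already expired while we were busy (or while the
        # vcpu was scheduled out), so the handler has to go around again.
        retries += 1
        if retries >= retry_limit:
            # The handler looks hung: widen min_delta based on the total
            # observed time, which includes the stolen time.
            return factor * elapsed_ns
    return None


CALLBACK_NS = 20_000        # hypothetical: 20 us of real callback work
STOLEN_NS = 40_000_000      # hypothetical: vcpu descheduled for 40 ms
GAP_NS = 1_000_000          # hypothetical: next timer due in 1 ms

# No stolen time: the first pass finishes well before the next expiry.
print(simulate_hang_check([CALLBACK_NS] * 4, GAP_NS))

# One stolen pass followed by a normal one: a single retry, no hang report.
print(simulate_hang_check([CALLBACK_NS + STOLEN_NS, CALLBACK_NS], GAP_NS))

# Four stolen passes in a row: the check fires with a min_delta near 0.5 s.
print(simulate_hang_check([CALLBACK_NS + STOLEN_NS] * 4, GAP_NS))
```

In this model a single stolen time slice costs one extra pass and nothing more. The check only fires when the vcpu loses the CPU on four consecutive passes, which is the part that still needs an explanation on an otherwise idle host.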