Re: kvm guest: hrtimer: interrupt too slow

Thomas Gleixner <tglx@xxxxxxxxxxxxx> · Thu, 8 Oct 2009 17:06:11 +0200 (CEST)

On Thu, 8 Oct 2009, Michael Tokarev wrote:
> Thomas Gleixner wrote:
> > 
> > I'm really missing the big picture here. 
> > What does "causes timers to be calculated on the 'wrong' CPU etc." mean?
> > And what do you consider a "scheduling mistake" ?
> 
> From the initial diagnostics by Marcelo:
> 
> > It seems the way hrtimer_interrupt_hanging calculates min_delta is
> > wrong (especially for virtual machines). The guest vcpu can be scheduled
> > out during the execution of the hrtimer callbacks (and the callbacks
> > themselves can do operations that translate to blocking operations in
> > the hypervisor).
> >
> > So high min_delta values can be calculated if, for example, a single
> > hrtimer_interrupt run takes two host time slices to execute, while some
> > other higher priority task runs for N slices in between.
> 
> From this I conclude that the huge min_delta is due to some other task(s)
> on the host running while this guest is in an hrtimer callback.  But I
> fail to see why that process on the host takes SO MUCH time that it
> pushes min_delta to 0.5s, or causes delays of 3..5 seconds in the
> guest.  Delays of several extra milliseconds would be ok,
> but *seconds* is too much.
> 
> Note again that neither host nor guest is under high load when
> this jump happens.  Also note that there are no high-priority processes
> running on the host; all are at the same priority level, including
> all the guests.
> 
> Note also that so far I only see it on SMP guests, never on UP
> guests.  And only on guests with kvm_clock, not with acpi_pm
> clocksource.
> 
> What I'm trying to say is that it looks like there's something
> else wrong here in the guest code.  Huge stalls, huge delays
> while in an hrtimer callback (I think it always happens when such a
> delay occurs, it's just noticed by the hrtimer code) -- that's
> the root cause of all this.  The (probably) wrong logic in hrtimer
> calibration just shows the results of something that's wrong
> elsewhere.

Ah, ok. That makes sense. The hrtimer interrupt hang check detects
that the CPU was "stolen" for whatever reason. I'm wondering why this
happens several times in a row - it takes at least 4 iterations until
it decides to make the interval larger.
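
To make the "several times in a row" point concrete, here is a toy
userspace model of the retry logic, not the kernel code. All names
(run_interrupt, retry_limit, widen_factor) are made up for illustration.
The retry limit of 4 is the figure mentioned above; the widening factor
of 3 is only an assumption about how the new min_delta is derived from
the time one pass took.

```python
def run_interrupt(pass_durations_ns, next_expiry_gap_ns, min_delta_ns,
                  retry_limit=4, widen_factor=3):
    """Model one hrtimer interrupt; return (min_delta_ns, retries).

    pass_durations_ns   wall-clock time of each pass over the expired
                        timers, including time the vcpu was scheduled out
    next_expiry_gap_ns  distance from the start of a pass to the next
                        timer expiry
    retry_limit         hypothetical: retries before the handler is
                        declared "too slow"
    widen_factor        hypothetical: multiplier applied to the slow
                        pass to get the new min_delta
    """
    retries = 0
    for duration_ns in pass_durations_ns:
        # The pass finished before the next expiry: the device can be
        # reprogrammed and the interrupt returns normally.
        if duration_ns < next_expiry_gap_ns:
            return min_delta_ns, retries
        # The next expiry is already in the past, so go around again.
        retries += 1
        if retries >= retry_limit:
            # Too many slow passes in a row: widen the minimum interval.
            return max(min_delta_ns, widen_factor * duration_ns), retries
    return min_delta_ns, retries


if __name__ == "__main__":
    # One stolen pass followed by a fast one: min_delta is left alone.
    print(run_interrupt([170_000_000, 20_000], 1_000_000, 1_000))
    # Four stolen passes in a row: min_delta is widened.
    print(run_interrupt([170_000_000] * 4, 1_000_000, 1_000))
```

In this model a single stolen pass is harmless. Only four slow passes
back to back widen min_delta, here to 3 * 170 ms = 510 ms, which is the
order of magnitude reported in this thread.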

Thanks,

	tglx
