On Sun, Oct 04, 2009 at 04:01:02PM +0400, Michael Tokarev wrote: > Marcelo Tosatti wrote: >> Michael, >> >> Can you please give the patch below a try please? (without acpi_pm >> timer or priority adjustments for the guest). > > Sure. I'll try it out in a hour or two, while I can experiment freely because > it's weekend. > > But I wonder... > [] >>> hrtimer: interrupt too slow, forcing clock min delta to 93629025 ns >> >> It seems the way hrtimer_interrupt_hanging calculates min_delta is >> wrong (especially to virtual machines). The guest vcpu can be scheduled >> out during the execution of the hrtimer callbacks (and the callbacks >> themselves can do operations that translate to blocking operations in >> the hypervisor). >> >> So high min_delta values can be calculated if, for example, a single >> hrtimer_interrupt run takes two host time slices to execute, while some >> other higher priority task runs for N slices in between. >> >> Using the hrtimer_interrupt execution time (which can be the worse >> case at any given time), as the min_delta is problematic. >> >> So simply increase min_delta_ns by 50% once every detected failure, >> which will eventually lead to an acceptable threshold (the algorithm >> should scale back to down lower min_delta, to adjust back to wealthier >> times, too). > > ..I wonder what should I check for. I mean, the end result of this patch > is not entirely clear to me, what should it change? I see that instead > of the now-calculated-after-error (probably very large) min_delta, it's > increased to a smaller value. > > So I should be getting more such messages (forcing min_delta to $foo), but > the "responsiveness" of the guest should stay in more or less acceptable > range (unless it will continue erroring, in which case the "responsiveness" > will be slowly reduced). Right. > Yes indeed, it's better than current situation, when the guest works fine > initially but out of the sudden it switches to a wild very-slow-to-reply > mode. But it does not look like a right solution either, even if the > back adjustment (mentioned in the last statement above) will be implemented. > Unless I completely missed the point... > > Neverless, the question stands: what I'm looking for now, when the patch is > applied? I can't measure the "responsiveness", especially since the min_delta > gets set to different (large) values each time (I mean current situation > without the patch). You should see min_delta_ns increase to a much smaller value, hopefully in the 2000-10000 range. min_delta_ns is the minimum delay a high resolution timer can have. You had it set in the hundreds of milliseconds range. > > Thanks! > > /mjt > >> diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c >> index 49da79a..8997978 100644 >> --- a/kernel/hrtimer.c >> +++ b/kernel/hrtimer.c >> @@ -1234,28 +1234,20 @@ static void __run_hrtimer(struct hrtimer *timer) >> #ifdef CONFIG_HIGH_RES_TIMERS >> -static int force_clock_reprogram; >> - >> /* >> * After 5 iteration's attempts, we consider that hrtimer_interrupt() >> * is hanging, which could happen with something that slows the interrupt >> - * such as the tracing. Then we force the clock reprogramming for each future >> - * hrtimer interrupts to avoid infinite loops and use the min_delta_ns >> - * threshold that we will overwrite. >> - * The next tick event will be scheduled to 3 times we currently spend on >> - * hrtimer_interrupt(). This gives a good compromise, the cpus will spend >> - * 1/4 of their time to process the hrtimer interrupts. This is enough to >> - * let it running without serious starvation. >> + * such as the tracing, so we increase min_delta_ns. >> */ >> static inline void >> -hrtimer_interrupt_hanging(struct clock_event_device *dev, >> - ktime_t try_time) >> +hrtimer_interrupt_hanging(struct clock_event_device *dev) >> { >> - force_clock_reprogram = 1; >> - dev->min_delta_ns = (unsigned long)try_time.tv64 * 3; >> - printk(KERN_WARNING "hrtimer: interrupt too slow, " >> - "forcing clock min delta to %lu ns\n", dev->min_delta_ns); >> + dev->min_delta_ns += dev->min_delta_ns >> 1; >> + if (printk_ratelimit()) >> + printk(KERN_WARNING "hrtimer: interrupt too slow, " >> + "forcing clock min delta to %lu ns\n", >> + dev->min_delta_ns); >> } >> /* >> * High resolution timer interrupt >> @@ -1276,7 +1268,7 @@ void hrtimer_interrupt(struct clock_event_device *dev) >> retry: >> /* 5 retries is enough to notice a hang */ >> if (!(++nr_retries % 5)) >> - hrtimer_interrupt_hanging(dev, ktime_sub(ktime_get(), now)); >> + hrtimer_interrupt_hanging(dev); >> now = ktime_get(); >> @@ -1342,7 +1334,7 @@ void hrtimer_interrupt(struct clock_event_device >> *dev) >> /* Reprogramming necessary ? */ >> if (expires_next.tv64 != KTIME_MAX) { >> - if (tick_program_event(expires_next, force_clock_reprogram)) >> + if (tick_program_event(expires_next, 0)) >> goto retry; >> } >> } -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html