Thomas Gleixner wrote:
[]
> Also it's not clear to me why the problem does only happen with
> kvm_clock and not with acpi_pm timer emulation (according to the
> reporter) and is restricted to SMP guests.
I just reproduced it with acpi_pm. I already explained this to Marcelo:
the problem is that the issue is difficult to trigger. I still don't
have any pointers as to how to trigger it; all my attempts so far to
create network, disk or cpu load have failed. So the only way is to run
the guest and wait, in the hope that it shows up. I restarted the
"guinea pig" guest today (which happens to be our main office server :),
and voila, after ~4 hours of uptime it said the same thing about hrtimer.
That was lucky timing, since it may run stable for several days...

It just happens (and I mentioned it each time) that I didn't *see* the
issue with acpi_pm before. Now I see it with acpi_pm too.
Speaking of SMP: that one is in the same category. Maybe SMP just
makes the issue easier to trigger and it exists with UP guests too,
maybe it's SMP-specific; I don't know. What I know for sure is that
out of 4 guests here (running on the same host), 2 are SMP and 2 UP,
loaded approximately equally (relative to the number of CPUs), and
the two SMP guests show the issue quite easily, while for the 2 UP
guests I haven't seen anything in the logs for the last 2 months.
The issue isn't unique to my machines; other people have reported it
too in #kvm, including at least one active participant there. For him,
the issues stopped when he switched from an SMP to a UP guest. Yet
there's no definite knowledge of whether the issue is really
SMP-specific or not.
> >  retry:
> >         /* 5 retries is enough to notice a hang */
> > -       if (!(++nr_retries % 5))
> > -               hrtimer_interrupt_hanging(dev, ktime_sub(ktime_get(), now));
> > +       if (!(++nr_retries % 5)) {
> > +               ktime_t try_time = ktime_sub(ktime_get(), now);
> > +
> > +               do {
> > +                       for (i = 0; i < 3; i++)
> > +                               expires_next = ktime_add(expires_next,try_time);
> > +               } while (tick_program_event(expires_next, 0));
> This needs at least a WARN_ON_ONCE() or some other way (sysfs, proc,
> ...) where we can find out how often this happens.
Definitely. Or printk_ratelimit.
Before Marcelo came up with his first version I was thinking about
exposing that min_delta over procfs to be able to reset it back to a
reasonable value... ;)
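
To make the "find out how often this happens" idea concrete, here is a
rough userspace sketch of the accounting, not kernel code and not a
proposed patch. The names hang_stats and note_retry are made up for
illustration, plain int64_t nanoseconds stand in for ktime_t, and the
"warn once" flag only imitates what WARN_ON_ONCE() would do in the
kernel. It also does a single back-off step of 3 * try_time, whereas
the quoted hunk repeats that step until tick_program_event() succeeds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical userspace stand-in; none of these names exist in the kernel. */
struct hang_stats {
    uint64_t nr_hangs;  /* how many times the back-off path was taken */
    bool warned;        /* set after the first message, imitating WARN_ON_ONCE() */
};

/*
 * Follows the shape of the quoted hunk: on every 5th retry, push the next
 * expiry out by 3 * try_time and record the event. Returns the (possibly
 * unchanged) expiry time. All times are plain nanosecond counts.
 */
static int64_t note_retry(struct hang_stats *st, unsigned int *nr_retries,
                          int64_t expires_next, int64_t try_time)
{
    assert(try_time >= 0);

    if (++(*nr_retries) % 5)
        return expires_next;    /* not a 5th retry: nothing to record */

    st->nr_hangs++;             /* the counter one could expose via proc/sysfs */
    if (!st->warned) {
        st->warned = true;
        fprintf(stderr, "hrtimer: retry loop hit, backing off\n");
    }
    return expires_next + 3 * try_time;
}
```

With something like nr_hangs exported, one could at least tell a guest
that hit this once from a guest that hits it every few seconds.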
Thanks!
/mjt