Re: smp guest questions

Replying to myself & top-posting for reference.

I can't reproduce the problem - neither of the
two issues with timers mentioned in my original
email quoted below.

But there IS a race somewhere, that's for sure.

When I saw both - "pm-timer running at 200% rate"
and "hrtimer: interrupt too slow" (and I saw them
more than once on this configuration) - it was
during host system startup, when it starts all
the guest machines (several of them) and they
continue their own startup in the background, all
at once.  I.e., it happened more than once when
several kvm guests got started all together.

Playing with it more I wasn't able to repeat the
issue, and can't trigger it with 4 guests on my
test machine at home either.  But it happened
again "when I wasn't watching", also during
massive guest startup.

Another issue happened during startup (or, rather,
AFTER such a massive startup, when one guest reported
the 200% rate of pm-timer, probably at the same time
the hrtimer message popped up) - another guest
locked up hard: the kvm process was looping, using 100%
cpu time, and did not answer monitor socket requests
(it was supposed to listen on a unix socket for monitor
commands).  *Probably* at the time when one guest was
in the locked state, another guest reported that hrtimer
message - but I'm not 100% sure, since I can only judge
by the "--MARK--" messages in the syslog of the dead guest,
which are at 20-minute intervals.  Maybe some "random
glitch", I dunno ;)
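
For the archive, a minimal sketch of how such a hung monitor could be
detected from a script.  It assumes the monitor is exposed on a unix
stream socket and sends a greeting or prompt right after connect; the
function name and any socket path passed to it are hypothetical, not
taken from the setup described above.

```python
import socket


def monitor_responds(path, timeout=2.0):
    """Return True if anything is received from the unix socket at
    `path` within `timeout` seconds, False otherwise.

    Assumption: a healthy monitor sends a banner/prompt on connect.
    A process stuck in a busy loop may still have the connection
    queued by the kernel, but it never sends anything, so recv()
    times out and we report False.
    """
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect(path)
        return len(s.recv(4096)) > 0
    except OSError:
        # Covers timeouts, a missing socket file and refused connections.
        return False
    finally:
        s.close()
```

A watchdog could call this periodically for each guest and log the
time of the first failure, which would narrow the window much better
than 20-minute syslog marks.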

In any case, since I can't provide more information
about all this despite all my attempts to reproduce
the situation... I consider this issue closed, for now
anyway.  But let it be archived for future reference :)

Thanks!

/mjt

Michael Tokarev wrote:
Avi Kivity wrote:
On 06/17/2009 11:38 AM, Michael Tokarev wrote:
After seeing words from Avi that smp guests
are ok now, I decided to try.  And immediately
got a few questions.

Running on a Phenom 9750 machine (PhenomI), AMD780G
chipset.  Host is 2.6.29 x86-64, qemu-kvm 0.10.5,
guests are linux with kvm paravirt bits enabled, also
dynticks (on both host and guest).


When booting a 2-CPU guest, I see in dmesg:

PM-Timer running at invalid rate: 200% of normal - aborting.

and indeed, in available_clocksource there's no pmtimer.
Should I be concerned?  It does not look healthy.
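
To make the "200%" figure concrete, here is a rough sketch of the
arithmetic behind such a message.  The ACPI PM timer nominally runs at
3.579545 MHz; the percentage is the number of PM timer ticks actually
counted over a reference interval, relative to the number expected.
The 5% tolerance and all function names are assumptions made for
illustration, not a copy of the kernel code.  The last helper checks a
clocksource listing for the "acpi_pm" entry.

```python
PMTMR_HZ = 3_579_545  # nominal ACPI PM timer frequency, in Hz


def pmtimer_rate_percent(ticks_seen, interval_s):
    """Observed PM timer rate as a percentage of the nominal rate."""
    expected = PMTMR_HZ * interval_s
    return 100.0 * ticks_seen / expected


def rate_is_plausible(percent, tolerance=5.0):
    """True if the rate is within `tolerance` percent of nominal.

    The 5% default is an assumption for this sketch.
    """
    return abs(percent - 100.0) <= tolerance


def has_clocksource(listing, name="acpi_pm"):
    """Check a whitespace-separated clocksource listing for `name`.

    `listing` is expected to be the text of
    /sys/devices/system/clocksource/clocksource0/available_clocksource.
    """
    return name in listing.split()
```

A reading of 200% means that twice the expected number of PM timer
ticks were counted, i.e. either the PM timer ran fast or the reference
interval it was measured against was stretched to double its length,
which is plausible if the vcpu was stalled during calibration.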


It's a bug, please post guest details (kernel version, bitness).

The guest kernel is also 2.6.29[.5], but this time it's x86-32
(compiled for P4).  kvm userspace is also 32bits (historical) --
only host kernel is 64bit for now.  I'll try to do some more
experiments later today on a test machine (this is a production
box) -- "hopefully" that same issue will occur on another
machine :)

Copying Marcelo.


Some time later, I see stuff like:

hrtimer: interrupt too slow, forcing clock min delta to 47210997 ns

Which reminds me of issues I had with a broken hpet (time goes
back-n-forth with similar messages shown in dmesg, but
about hpet, not hrtimer).  Also does not look healthy.


I haven't seen either of the two messages above on any of
single-processor guests so far, at least with recent kernels
and kvm userspace, only on smp (2 cpu for now).
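
Since both messages show up only occasionally, a small scanner over
saved dmesg or syslog text may help to catch them.  This is a sketch
only: the patterns are modelled on the two lines quoted above, and the
names are made up for illustration.

```python
import re

# Patterns modelled on the two messages quoted in this thread.
PATTERNS = {
    "pmtimer": re.compile(
        r"PM-Timer running at invalid rate: (\d+)% of normal"),
    "hrtimer": re.compile(
        r"hrtimer: interrupt too slow, "
        r"forcing clock min delta to (\d+) ns"),
}


def scan_timer_warnings(log_text):
    """Return a list of (kind, value) tuples found in `log_text`.

    For "pmtimer" the value is the reported percentage, for
    "hrtimer" it is the forced min delta in nanoseconds.
    """
    hits = []
    for line in log_text.splitlines():
        for kind, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                hits.append((kind, int(match.group(1))))
    return hits
```

For scale, the min delta of 47210997 ns reported above is about
47.2 ms.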

Please also post host /proc/cpuinfo.

HOST cpuinfo (only for 4th core, other cores are similar):
processor    : 3
vendor_id    : AuthenticAMD
cpu family    : 16
model        : 2
model name    : AMD Phenom(tm) 9750 Quad-Core Processor
stepping    : 3
cpu MHz        : 1200.000
(yes ondemand cpufreq is in effect - nominal frequency is 2400.
I had no issues with cpufreq on this box so far, including all
the guests).
cache size    : 512 KB
physical id    : 0
siblings    : 4
core id        : 3
cpu cores    : 4
apicid        : 3
initial apicid    : 3
fpu        : yes
fpu_exception    : yes
cpuid level    : 5
wp        : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc pni monitor cx16 lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs
bogomips    : 4812.67
TLB size    : 1024 4K pages
clflush size    : 64
cache_alignment    : 64
address sizes    : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate



cpuinfo on GUEST (also for only one CPU):

processor    : 1
vendor_id    : AuthenticAMD
cpu family    : 6
model        : 2
model name    : QEMU Virtual CPU version 0.10.5
stepping    : 3
cpu MHz        : 2405.894
cache size    : 512 KB
fdiv_bug    : no
hlt_bug        : no
f00f_bug    : no
coma_bug    : no
fpu        : yes
fpu_exception    : yes
cpuid level    : 2
wp        : yes
flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall lm pni hypervisor
bogomips    : 4811.78
clflush size    : 64
power management:


Thanks!

/mjt
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
