On 06/02/2011 11:25 AM, Marc Haber wrote:
Hi,

I have just started deploying a host doing virtualization with KVM. The box has an Athlon 64 X2 and 4 GB RAM, and is running Debian squeeze with a locally built 2.6.39 kernel and backported versions of qemu-kvm (0.14.0) and libvirt (0.9.0) from Debian sid.

The box is currently hosting five VMs, all of them Debian systems as well and rather unloaded. The only time there is significant load is when all VMs start up their cron jobs simultaneously.

When the host starts up, it immediately spews the following lines to the console:

kvm: 2865: cpu0 unhandled rdmsr: 0xc0010048
kvm: 2865: cpu0 unhandled wrmsr: 0xc0010048 data 2100000401
kvm: 2865: cpu0 unhandled rdmsr: 0xc0010001
kvm: 2849: cpu0 unhandled rdmsr: 0xc0010048
kvm: 2849: cpu0 unhandled wrmsr: 0xc0010048 data c0579f7cc0010448
kvm: 2849: cpu0 unhandled rdmsr: 0xc0010001
kvm: 2950: cpu0 unhandled rdmsr: 0xc0010048
kvm: 2950: cpu0 unhandled wrmsr: 0xc0010048 data c0579f7cc0010448
kvm: 2849: cpu1 unhandled rdmsr: 0xc0010048
kvm: 2963: cpu0 unhandled rdmsr: 0xc0010112
kvm: 2963: cpu0 unhandled rdmsr: 0xc0010048
kvm: 2963: cpu0 unhandled wrmsr: 0xc0010048 data 2100000401
kvm: 2963: cpu0 unhandled rdmsr: 0xc0010001
kvm: 2963: cpu1 unhandled rdmsr: 0xc0010048
kvm: 2963: cpu1 unhandled wrmsr: 0xc0010048 data 2100000401

Every few days, the system stops dead in its tracks and needs a hard reset to be revived. I have a serial console, which unfortunately disconnects me after a few minutes of inactivity and only caches the last few lines of activity. Whenever I connect to the serial console of the frozen system, I see a few lines of the same "unhandled (rd|wr)msr" messages. The syslog doesn't show anything strange.

Is there any possibility that the freezes have to do with the "unhandled (rd|wr)msr" messages?
Very unlikely.
What else could be the cause? In the meantime, I have taken the box offline and am running memtest. Up to now, everything seems to be fine. Any hints will be appreciated.
You might try setting up netconsole to get reliable logging. Do you have NMIs? 'grep NMI /proc/interrupts'. Does running 'perf top -F 10000' make the hang come sooner?

-- 
error compiling committee.c: too many arguments to function
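
If it helps to watch the NMI counters over time rather than grepping by hand, the check can be scripted. What follows is only a minimal sketch: the function name nmi_counts is made up for illustration, and it assumes the usual /proc/interrupts layout where the "NMI:" label is followed by one integer per CPU and then a text description.

```python
def nmi_counts(interrupts_text):
    """Return the per-CPU NMI counts found in the text of /proc/interrupts.

    Hypothetical helper: assumes a line starting with "NMI:" followed by
    one integer per CPU and then a free-text description.
    """
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "NMI:":
            counts = []
            for field in fields[1:]:
                # Stop at the first non-numeric field (the description).
                if not field.isdigit():
                    break
                counts.append(int(field))
            return counts
    # No NMI line present.
    return []
```

On the host, pass it the contents of /proc/interrupts, e.g. nmi_counts(open("/proc/interrupts").read()). Logging the result periodically would show whether the counters are moving at all before a hang.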