Bugs item #2351676, was opened at 2008-11-26 20:59
Message generated for change (Comment added) made by mjtsf
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2351676&group_id=180599

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Chris Jones (c_jones)
Assigned to: Nobody/Anonymous (nobody)
Summary: Guests hang periodically on Ubuntu-8.10

Initial Comment:
I'm seeing periodic hangs on my guests. I've been unable so far to find a trigger - they always boot fine, but after anywhere from 10 minutes to 24 hours they eventually hang completely.

My setup:
 * AMD Athlon X2 4850e (2500 MHz dual core)
 * 4 GB memory
 * Ubuntu 8.10 server, 64-bit
 * KVMs tried:
   : kvm-72 (shipped with Ubuntu)
   : kvm-79 (built myself, --patched-kernel option)
 * Kernels tried:
   : 2.6.27.7 (kernel.org, self built)
   : 2.6.27-7-server from Ubuntu 8.10 distribution

In guests:
 * Ubuntu 8.10 server, 64-bit (virtual machine install)
 * kernel 2.6.27-7-server from Ubuntu 8.10

I'm running the guests like:

  sudo /usr/local/bin/qemu-system-x86_64 \
       -daemonize \
       -no-kvm-irqchip \
       -hda Imgs/ndev_root.img \
       -m 1024 \
       -cdrom ISOs/ubuntu-8.10-server-amd64.iso \
       -vnc :4 \
       -net nic,macaddr=DE:AD:BE:EF:04:04,model=e1000 \
       -net tap,ifname=tap4,script=/home/chris/kvm/qemu-ifup.sh

The problem does not happen if I use -no-kvm.

I've tried some other options that have no effect:
  -no-kvm-pit
  -no-acpi

The disk images are raw format.

When the guests hang, I cannot ping them, and the vnc console is hung. The qemu monitor is still accessible, and the guests recover if I issue a system_reset command from the monitor. However, the console often will not accept keyboard input after doing so.
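Since no trigger has been found and the hang can take up to 24 hours to appear, it may help to log the moment a guest stops answering pings so it can be matched against host logs. The following is only a minimal sketch, not part of the original report: the function names, the probe interval, the threshold of three failures, and the guest address passed to `watch` are all assumptions, and the `ping -c 1 -W` flags assume Linux iputils.

```python
import subprocess
import time
from datetime import datetime


def ping_once(host, timeout_s=2):
    """Return True if a single ICMP echo to `host` succeeds.

    Assumes Linux iputils ping: -c is the packet count, -W the reply
    timeout in seconds.
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def first_hang_index(results, threshold=3):
    """Return the index where the first run of `threshold` consecutive
    failed probes starts, or None if there is no such run."""
    run = 0
    for i, ok in enumerate(results):
        run = 0 if ok else run + 1
        if run == threshold:
            return i - threshold + 1
    return None


def watch(host, interval_s=10, threshold=3):
    """Probe `host` until `threshold` probes in a row fail, then print
    the time of the first failed probe in that run."""
    history = []  # (timestamp, reachable) for the most recent probes
    while True:
        history.append((datetime.now(), ping_once(host)))
        history = history[-threshold:]
        start = first_hang_index([ok for _, ok in history], threshold)
        if start is not None:
            stamp = history[start][0].isoformat(timespec="seconds")
            print(f"{stamp} {host}: no ping reply, guest may be hung")
            return
        time.sleep(interval_s)
```

Calling `watch("<guest address>")` on the host blocks until the guest has missed three probes in a row. A single lost packet is ignored on purpose, so brief network hiccups are not reported as hangs.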
When the guest is hung, kvm_stat shows all 0s for the counters:

  efer_relo exits fpu_reloa halt_exit halt_wake host_stat hypercall
  insn_emul insn_emul invlpg io_exits irq_exits irq_windo largepage
  mmio_exit mmu_cache mmu_flood mmu_pde_z mmu_pte_u mmu_pte_w mmu_recyc
  mmu_shado nmi_windo pf_fixed pf_guest remote_tl request_i signal_ex
  tlb_flush
  > 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0
    0 0 0 0 0 0

gdb shows two threads - both waiting:

(gdb) info threads
  2 Thread 0x414f1950 (LWP 422)  0x00007f36f07a03e1 in sigtimedwait () from /lib/libc.so.6
  1 Thread 0x7f36f1f306e0 (LWP 414)  0x00007f36f084b482 in select () from /lib/libc.so.6
(gdb) thread 1
[Switching to thread 1 (Thread 0x7f36f1f306e0 (LWP 414))]#0  0x00007f36f084b482 in select () from /lib/libc.so.6
(gdb) bt
#0  0x00007f36f084b482 in select () from /lib/libc.so.6
#1  0x00000000004094cb in main_loop_wait (timeout=0) at /home/chris/pkgs/kvm/kvm-79/qemu/vl.c:4719
#2  0x000000000050a7ea in kvm_main_loop () at /home/chris/pkgs/kvm/kvm-79/qemu/qemu-kvm.c:619
#3  0x000000000040fafc in main (argc=<value optimized out>, argv=0x7ffff9f41948) at /home/chris/pkgs/kvm/kvm-79/qemu/vl.c:4871
(gdb) thread 2
[Switching to thread 2 (Thread 0x414f1950 (LWP 422))]#0  0x00007f36f07a03e1 in sigtimedwait () from /lib/libc.so.6
(gdb) bt
#0  0x00007f36f07a03e1 in sigtimedwait () from /lib/libc.so.6
#1  0x000000000050a560 in kvm_main_loop_wait (env=0xc319e0, timeout=0) at /home/chris/pkgs/kvm/kvm-79/qemu/qemu-kvm.c:284
#2  0x000000000050aaf7 in ap_main_loop (_env=<value optimized out>) at /home/chris/pkgs/kvm/kvm-79/qemu/qemu-kvm.c:425
#3  0x00007f36f11ba3ea in start_thread () from /lib/libpthread.so.0
#4  0x00007f36f0852c6d in clone () from /lib/libc.so.6
#5  0x0000000000000000 in ?? ()

Any clues to help me resolve this would be much appreciated.

----------------------------------------------------------------------

Comment By: Michael Tokarev (mjtsf)
Date: 2009-02-09 16:52

Message:
OK, I have a very similar issue here as well.
Host - 4-core Phenom CPU and AMD 780G chipset, running 2.6.28.4-x86-64 (from kernel.org). kvm-83.
32-bit guest - 2.6.27.13-i686smp, also from kernel.org. The guest is running with the KVM_GUEST stuff enabled, using the kvm timer and virtio network and block.

The system is Debian (lenny-to-be) on both, but I don't think it matters since both use custom-compiled kernels.

The guest - at least one of them - hangs, especially when many guests are running in parallel (we have 4 Windows machines and 4 Linux machines, mostly idle). When it hangs, nothing really works - console, ping, etc. It usually continues working after 1..2 minutes or more.

During the hang, the host is either silent or is spewing tons of "vcpu not ready for apic_round_robin" messages (several thousand of them), but I can't be sure that message is directly related to the hangs. Nothing is logged on the guest.

The only guest affected so far is assigned 2 virtual CPUs. I'll try to reboot it with a single CPU only to see if that changes anything.

I wasn't able to check gdb/trace/etc so far, because the guest that hangs is my main working machine, which is a terminal server, so I have to run to another room to the server's console and check there.

----------------------------------------------------------------------

Comment By: Dustin Kirkland (dustin_kirkland)
Date: 2009-02-09 15:38

Message:
In the Ubuntu 8.10 guest, can you try the linux-image-virtual kernel?

The current one points to linux-image-2.6.27-11-virtual.

:-Dustin

----------------------------------------------------------------------

Comment By: Daniel Poelzleithner (poelzi)
Date: 2009-01-18 09:18

Message:
New stability info on my side.

Host:
Linux dirus-dom 2.6.28-2-server #3-Ubuntu SMP Thu Dec 4 22:35:12 UTC 2008 x86_64 GNU/Linux

Guest:
2.6.28 x86_64
- disabled all kvm guest options (with kvm_clock disabled)
- enabled virtio_block
- started with -smp 1 and -smp 2

They haven't crashed yet, with either 1 or 2 CPUs. I think disabling kvm guest support did the trick.
However, using NFS from the guest is quite slow and does not seem very stable. I have the feeling the guest lags quite often, but it has seen loads up to 11: running crashme, a high -j kernel build, and file transfers didn't crash the machine.

----------------------------------------------------------------------

Comment By: James Thomason (james_thomason)
Date: 2009-01-15 10:30

Message:
Update:

I installed Ubuntu 8.10 server and upgraded to 2.6.29-rc1 and KVM-83. I am still able to reproduce the hang when running kvm with -smp > 1. New behavior in this configuration is the printing of the message "Stuck??" to the console, followed shortly by a kernel panic.

KVM Host:
Ubuntu Server 8.10
Linux 2.6.29-rc1
KVM-83

KVM Guest:
Ubuntu Server 8.10
2.6.27-9-server

----------------------------------------------------------------------

Comment By: James Thomason (james_thomason)
Date: 2009-01-15 10:20

Message:
Hello,

I am able to reliably reproduce a condition where a guest goes into a tight loop or spinlock on all running cores. The scenario is exactly as described in bug 2351676, though my environment differs as detailed below.

My observation is that the issue is correlated with the number of VCPUs assigned to the guest and with CPU load. The higher the number of VCPUs and the CPU utilization, the more easily it is triggered.

If a KVM developer is interested in debugging live, I might be able to arrange getting the system in question into a DMZ.
A review of the kvm tracker leads me to believe that the following bugs are possibly related:

[ 2351676 ] Guests hang periodically on Ubuntu-8.10
[ 2353811 ] Solaris 10 guest unstable
[ 2494730 ] Guests "stalling" on kvm-82
[ 2138079 ] kvm locks up system
[ 2113643 ] guests AND host still getting stuck under CPU load

KVM Host Configuration:
4 x Quad-Core AMD Opteron Processors (8346 HE @ 1.8 GHz)
64 GB DDR2 667 MHz
Fedora 10 x64
Kernel 2.6.28
KVM-82

KVM Guest Configuration:
32 GB Memory
1 to 16 VCPUs
CentOS 5.2 x64
Kernel 2.6.28
IDE disk
e1000 NIC

----------------------------------------------------------------------

Comment By: Daniel Poelzleithner (poelzi)
Date: 2009-01-13 22:11

Message:
I have a very similar setup.

Host: Ubuntu 8.10, kernel 2.6.28-2-server
KVM: 72, 80, 81, 82, 83 tried (using the up-to-date kvm module, too)

Guests:
Endian Firewall (CentOS based), kernel 2.6.22.19-72.endian15
  Is stable so far. Sometimes loses USB devices.
Ubuntu 8.10, kernel 2.6.27, 2.6.28-2-server, 2.6.28 vanilla home brew
  Very unstable.

As the Ubuntu 8.10 guest is also unstable when using the 2.6.28 vanilla kernel, I'm not so sure it's a guest problem. I will now compile a 2.6.28 kernel without any kvm guest support.

Things that don't seem to have an effect:
- using ide instead of virtio
- using e1000 instead of virtio

However, it seems that it may be caused by I/O access, but it is not easily reproducible.

The last thing I tried: using the kernel parameters "clocksource=acpi_pm notsc" in the guest. Still investigating whether it makes the guest stable.

By the way, with kvm-82 I saw around 100 io_exits when only the crashed Ubuntu 8.10 guest was running, nothing else.

----------------------------------------------------------------------

Comment By: Chris Jones (c_jones)
Date: 2008-12-10 23:29

Message:
Actually, I was too quick to say that a Fedora 8 guest is stable. Even there, I'm seeing hangs once I get my application fully installed (basically, once I introduce some load).
I also did an update to kvm-80 and the problem still exists (on all the guests I've tried). That's with the kvm-80 kernel modules and the kvm-80 userspace, running on linux-2.6.27.8.

Thanks,
Chris

----------------------------------------------------------------------

Comment By: Chris Jones (c_jones)
Date: 2008-12-01 22:09

Message:
Alexey,

Thanks for the response. As you advised, I tried a Fedora 8 guest, and it does seem to be much more stable. However, I really need a Debian base system for my application. Not necessarily Ubuntu 8.10, but I haven't had much luck with others either. Do you have any recommendations on one that is particularly stable?

Over the weekend I tried:

Fedora 8         : Seems very stable, but I really need a Debian base.
Ubuntu 8.04 LTS  : Same periodic hangs I was seeing on 8.10.
Debian 4.0 Etch  : Seems stable on the guest, but on the host, the qemu process is running 100% busy while the guest is idle.

Any chance you know a workaround for the issue I'm seeing on Etch, or can recommend a Debian-based distribution which works well with KVM?

Thanks much,
Chris

----------------------------------------------------------------------

Comment By: Technologov (technologov)
Date: 2008-11-27 15:54

Message:
In my opinion it is not the Ubuntu host that is problematic, but the guest on KVM. I mean that the Ubuntu 8.10 guest is unstable on KVM. I have not found out why.

If you try some better tested guest (Fedora 7/8 or Windows XP), it should be lots more stable. And if you try some other host (i.e. a Fedora host) and run an Ubuntu 8.10 guest, it will be unstable.

In short - in my opinion - the problem is not the host OS, but either KVM or its connection with the guest OS.

-Alexey E. "Technologov", 27.11.2008.