Bugs item #2042889, was opened at 2008-08-08 13:16
Message generated for change (Settings changed) made by jessorensen
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2042889&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: None
Group: None
>Status: Closed
>Resolution: Duplicate
Priority: 5
Private: No
Submitted By: Rafal Wijata (ravpl)
Assigned to: Nobody/Anonymous (nobody)
Summary: guest: device offline, then kernel panic

Initial Comment:
host: kvm71, 64-bit 2.6.18-92.1.6.el5, 16 GB RAM, 2 x X5450 (8 cores)
guest: 64-bit 2.6.18-92.1.6.el5, 3.5 GB RAM, 2 CPUs, 5 HDDs on raw partitions(!).

In the guest, I quite often get messages like:
kernel: sd 0:0:0:0: ABORT operation started.
kernel: sd 0:0:0:0: ABORT operation timed-out.
[many times like that]
[there were more messages about the device being offline, but I lost them; will update if it happens again]

Then the filesystem gets remounted read-only, and the kernel panics with the following message (only part of it; that's what I got on the screen):

FS:  0000000000000000(0000) GS:ffffffff8039f000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000400000013 CR3: 0000000000201000 CR4: 00000000000006e0
Process sshd (pid: 23911, threadinfo ffff81006f53a000, task ffff8100dc2ca0c0)
Stack:  ffffffff800075dc ffff8100dc1ba960 ffff8100dc1ba688 ffff810096b52300
 ffff8100dd15acc0 ffff8100dc1ba758 ffff8100dc1ba758 ffff810003f2a680
 ffffffff8000d11c 0000000000000008 0000000000000008 ffff8100dd15acc0
Call Trace:
 [<ffffffff800075dc>] kmem_cache_free+0x13c/0x1dd
 [<ffffffff8000d11c>] dput+0xf6/0x114
 [<ffffffff800125f3>] __fput+0x16c/0x198
 [<ffffffff8001a6a7>] remove_vma+0x3d/0x64
 [<ffffffff80039c60>] exit_mmap+0xcf/0xf3
 [<ffffffff8003bd73>] mmput+0x30/0x83
 [<ffffffff800151b6>] do_exit+0x28b/0x8d0
 [<ffffffff80048a1c>] cpuset_exit+0x0/0x6c
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0

Code: f0 ff 0f 0f 88 6c 01 00 00 c3 f0 81 2f 00 00 00 01 74 05 e8
RIP  [<ffffffff80064a2d>] _spin_lock+0x0/0xa
 RSP <ffff81006f53be10>
CR2: 0000000400000013
 <0>Kernel panic - not syncing: Fatal exception

Even though the kernel had panicked, the kvm process was still taking 100% CPU.
gdb shows the following info; I have no clue whether it's helpful in any way.

Thread 4 (Thread 1938626880 (LWP 17006)):
#0  0x000000368bec6fa7 in ioctl () from /lib64/libc.so.6
#1  0x000000000050f726 in kvm_run (kvm=0x11b15010, vcpu=0) at libkvm.c:903
#2  0x00000000004e9426 in kvm_cpu_exec (env=<value optimized out>) at /usr/src/kvm-71/qemu/qemu-kvm.c:218
#3  0x00000000004e9700 in ap_main_loop (_env=<value optimized out>) at /usr/src/kvm-71/qemu/qemu-kvm.c:407
#4  0x000000368ca062e7 in start_thread () from /lib64/libpthread.so.0
#5  0x000000368bece3bd in clone () from /lib64/libc.so.6

Thread 3 (Thread 1087498560 (LWP 17007)):
#0  0x000000368bec6fa7 in ioctl () from /lib64/libc.so.6
#1  0x000000000050f726 in kvm_run (kvm=0x11b15010, vcpu=1) at libkvm.c:903
#2  0x00000000004e9426 in kvm_cpu_exec (env=<value optimized out>) at /usr/src/kvm-71/qemu/qemu-kvm.c:218
#3  0x00000000004e9700 in ap_main_loop (_env=<value optimized out>) at /usr/src/kvm-71/qemu/qemu-kvm.c:407
#4  0x000000368ca062e7 in start_thread () from /lib64/libpthread.so.0
#5  0x000000368bece3bd in clone () from /lib64/libc.so.6

Thread 2 (Thread 1949133120 (LWP 17014)):
#0  0x000000368ca0a687 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000003692202ee5 in handle_fildes_io () from /lib64/librt.so.1
#2  0x000000368ca062e7 in start_thread () from /lib64/libpthread.so.0
#3  0x000000368bece3bd in clone () from /lib64/libc.so.6

Thread 1 (Thread 47523282295136 (LWP 16990)):
#0  0x000000368bec7922 in select () from /lib64/libc.so.6
#1  0x00000000004094b2 in main_loop_wait (timeout=<value optimized out>) at /usr/src/kvm-71/qemu/vl.c:7545
#2  0x00000000004e9342 in kvm_main_loop () at
    /usr/src/kvm-71/qemu/qemu-kvm.c:587
#3  0x0000000000411662 in main (argc=20, argv=0x7fffca7a9b38) at /usr/src/kvm-71/qemu/vl.c:7705
#0  0x000000368bec7922 in select () from /lib64/libc.so.6

----------------------------------------------------------------------

>Comment By: Jes Sorensen (jessorensen)
Date: 2010-11-30 12:39

Message:
This looks like a duplicate of
https://bugs.launchpad.net/qemu/+bug/587993

If you can reproduce this problem, it would be great if you could add the
info to the bug in Launchpad.

Thanks,
Jes

----------------------------------------------------------------------

Comment By: Rafal Wijata (ravpl)
Date: 2008-08-13 10:53

Message:
Logged In: YES
user_id=996150
Originator: YES

Another crash, this time with a guest backtrace. Please advise how to debug this.

R13: ffff8100dd107000 R14: ffffffff80077090 R15: ffffffff80418e80
FS:  0000000000000000(0000) GS:ffffffff8039f000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002aec1f42e000 CR3: 00000000d49e4000 CR4: 00000000000006e0

Call Trace:
 <IRQ>  [<ffffffff8003eadd>] dev_watchdog+0x98/0xc0
 [<ffffffff800953c2>] run_timer_softirq+0x133/0x1af
 [<ffffffff80011ed2>] __do_softirq+0x5e/0xd6
 [<ffffffff8005e2fc>] call_softirq+0x1c/0x28
 [<ffffffff8006c6e4>] do_softirq+0x2c/0x85
 [<ffffffff8005dc8e>] apic_timer_interrupt+0x66/0x6c
 <EOI>  [<ffffffff800d403a>] drain_array+0x28/0xc0
 [<ffffffff800d4aea>] cache_reap+0x0/0x219
 [<ffffffff800d4b8f>] cache_reap+0xa5/0x219
 [<ffffffff8004cea9>] run_workqueue+0x94/0xe4
 [<ffffffff800497be>] worker_thread+0x0/0x122
 [<ffffffff800498ae>] worker_thread+0xf0/0x122
 [<ffffffff8008ad76>] default_wake_function+0x0/0xe
 [<ffffffff8003253d>] kthread+0xfe/0x132
 [<ffffffff8005dfb1>] child_rip+0xa/0x11
 [<ffffffff8003243f>] kthread+0x0/0x132
 [<ffffffff8005dfa7>] child_rip+0x0/0x11

----------------------------------------------------------------------

Comment By: Rafal Wijata (ravpl)
Date: 2008-08-13 08:40

Message:
Logged In: YES
user_id=996150
Originator: YES

[guest] And finally the
device gets offline
[guest] sd 0:0:0:0: rejecting I/O to offline device

Is it possible that these problems come from the fact that I have
configured raw devices as kvm disks? E.g.:
-drive media=disk,if=scsi,boot=on,file=/dev/sdb2
-drive media=disk,if=scsi,boot=off,file=/dev/sdc2
...

----------------------------------------------------------------------

Comment By: Rafal Wijata (ravpl)
Date: 2008-08-11 15:26

Message:
Logged In: YES
user_id=996150
Originator: YES

Update_1: while the guest was panicking, I saw a SEGV for its qemu process
on the host. No core file though. I'll try next time.
Happened on kvm72 as well.