Hi all,

I have run into some problems while using KVM. The test environment is:

Summary:      Dell R610, 1 x Xeon E5645 2.40GHz, 47.1GB / 48GB 1333MHz DDR3
System:       Dell PowerEdge R610 (Dell 08GXHX)
Processors:   1 (of 2) x Xeon E5645 2.40GHz 5860MHz FSB (HT enabled, 6 cores, 24 threads)
Memory:       47.1GB / 48GB 1333MHz DDR3 == 12 x 4GB
Disk:         sda: 299GB (72%) JBOD
Disk:         sdb (host9): 5.0TB JBOD == 1 x VIRTUAL-DISK
Disk:         sdc (host11): 5.0TB JBOD == 1 x VIRTUAL-DISK
Disk:         sdd (host12): 5.0TB JBOD == 1 x VIRTUAL-DISK
Disk:         sde (host10): 5.0TB JBOD == 1 x VIRTUAL-DISK
Disk-Control: mpt2sas0: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon]
Disk-Control: host9:
Disk-Control: host10:
Disk-Control: host11:
Disk-Control: host12:
Chipset:      Intel 82801IB (ICH9)
Network:      br1 (bridge): 14:fe:b5:dc:2c:6e
Network:      em1 (bnx2): Broadcom NetXtreme II BCM5709 Gigabit, 14:fe:b5:dc:2c:6e, 1000Mb/s <full-duplex>
Network:      em2 (bnx2): Broadcom NetXtreme II BCM5709 Gigabit, 14:fe:b5:dc:2c:70, 1000Mb/s <full-duplex>
Network:      em3 (bnx2): Broadcom NetXtreme II BCM5709 Gigabit, 14:fe:b5:dc:2c:72, 1000Mb/s <full-duplex>
Network:      em4 (bnx2): Broadcom NetXtreme II BCM5709 Gigabit, 14:fe:b5:dc:2c:74, 1000Mb/s <full-duplex>
Network:      vnet0 (tun): fe:16:3e:49:fb:05, 10Mb/s <full-duplex>
Network:      vnet1 (tun): fe:16:3e:cb:c0:d1, 10Mb/s <full-duplex>
Network:      vnet2 (tun): fe:16:3e:1e:c1:c4, 10Mb/s <full-duplex>
Network:      vnet3 (tun): fe:16:3e:d5:58:f4, 10Mb/s <full-duplex>
Network:      vnet4 (tun): fe:16:3e:15:b4:16, 10Mb/s <full-duplex>
Network:      vnet5 (tun): fe:16:3e:d2:07:47, 10Mb/s <full-duplex>
Network:      vnet6 (tun): fe:16:3e:e1:2b:b9, 10Mb/s <full-duplex>
OS:           RHEL Server 6.1 (Santiago), Linux 2.6.32-220.2.1.el6.x86_64 x86_64, 64-bit
BIOS:         Dell 3.0.0 01/31/2011

While running KVM I have hit the following issues:

1. Host crash, caused by:

a. Kernel panic

      KERNEL: /usr/lib/debug/lib/modules/2.6.32-131.12.1.el6.x86_64/vmlinux
    DUMPFILE: ../vmcore_2012.13.46  [PARTIAL DUMP]
        CPUS: 24
        DATE: Wed Jan 11 13:34:13 2012
      UPTIME: 25 days, 04:11:05
LOAD AVERAGE: 223.16, 172.97, 158.23
       TASKS: 1464
    NODENAME: dell2.localdomain
     RELEASE: 2.6.32-131.12.1.el6.x86_64
     VERSION: #1 SMP Sun Jul 31 16:44:56 EDT 2011
     MACHINE: x86_64  (2394 Mhz)
      MEMORY: 48 GB
       PANIC: "kernel BUG at arch/x86/kernel/traps.c:547!"
         PID: 11851
     COMMAND: "qemu-kvm"
        TASK: ffff880c071c3500  [THREAD_INFO: ffff880c132d8000]
         CPU: 1
       STATE: TASK_RUNNING (PANIC)

PID: 11851  TASK: ffff880c071c3500  CPU: 1  COMMAND: "qemu-kvm"
 #0 [ffff880028207be0] machine_kexec at ffffffff810310cb
 #1 [ffff880028207c40] crash_kexec at ffffffff810b6392
 #2 [ffff880028207d10] oops_end at ffffffff814de670
 #3 [ffff880028207d40] die at ffffffff8100f2eb
 #4 [ffff880028207d70] do_trap at ffffffff814ddf64
 #5 [ffff880028207dd0] do_invalid_op at ffffffff8100ceb5
 #6 [ffff880028207e70] invalid_op at ffffffff8100bf5b
    [exception RIP: do_nmi+554]
    RIP: ffffffff814de43a  RSP: ffff880028207f28  RFLAGS: 00010002
    RAX: ffff880c132d9fd8  RBX: ffff880028207f58  RCX: 00000000c0000101
    RDX: 00000000ffff8800  RSI: ffffffffffffffff  RDI: ffff880028207f58
    RBP: ffff880028207f48   R8: ffff88005ebf9800   R9: ffff880028203fc0
    R10: 0000000000000034  R11: 00000000000003e8  R12: 000000000000cc20
    R13: ffffffff816024a0  R14: ffff88005ebf9800  R15: 00007ffffffff000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffff880028207f50] nmi at ffffffff814ddc90
    [exception RIP: bad_to_user+37]
    RIP: ffffffff814e4e2b  RSP: ffff880028207bb0  RFLAGS: 00010046
    RAX: ffff880c132d9fd8  RBX: ffff880c132d9c48  RCX: 0000000000000001
    RDX: 0000000000000000  RSI: 000000010000000b  RDI: ffff880028207c08
    RBP: ffff880028207c48   R8: ffff88005ebf9800   R9: ffff880028203fc0
    R10: 0000000000000034  R11: 00000000000003e8  R12: 000000000000cc20
    R13: ffffffff816024a0  R14: ffff88005ebf9800  R15: 00007ffffffff000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <NMI exception stack> ---

For this problem, I found that the panic is triggered by BUG_ON(in_nmi()), which means an NMI arrived while the CPU was already in NMI context. However, I checked the Intel manual and found: "While an NMI interrupt handler is executing, the processor disables additional calls to the NMI handler until the next IRET instruction is executed." So how can this happen?

b. CPU hard lockup in a qemu-kvm process

      KERNEL: /usr/lib/debug/lib/modules/2.6.32-131.12.1.el6.x86_64/vmlinux
    DUMPFILE: /var/crash/127.0.0.1-2012-02-18-21:20:13/vmcore  [PARTIAL DUMP]
        CPUS: 24
        DATE: Sat Feb 18 20:03:56 2012
      UPTIME: 71 days, 09:42:23
LOAD AVERAGE: 46.81, 44.32, 35.15
       TASKS: 1018
    NODENAME: virt15-njhx-kvm-19
     RELEASE: 2.6.32-131.12.1.el6.x86_64
     VERSION: #1 SMP Sun Jul 31 16:44:56 EDT 2011
     MACHINE: x86_64  (2394 Mhz)
      MEMORY: 48 GB
       PANIC: "Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 12"
         PID: 18704
     COMMAND: "qemu-kvm"
        TASK: ffff880041efb580  [THREAD_INFO: ffff8807309ba000]
         CPU: 12
       STATE: TASK_RUNNING (PANIC)

crash> bt
PID: 18704  TASK: ffff880041efb580  CPU: 12  COMMAND: "qemu-kvm"
 #0 [ffff8806454c7af0] machine_kexec at ffffffff810310cb
 #1 [ffff8806454c7b50] crash_kexec at ffffffff810b6392
 #2 [ffff8806454c7c20] panic at ffffffff814da64f
 #3 [ffff8806454c7ca0] watchdog_overflow_callback at ffffffff810d648d
 #4 [ffff8806454c7cc0] __perf_event_overflow at ffffffff81108b26
 #5 [ffff8806454c7d60] perf_event_overflow at ffffffff81109119
 #6 [ffff8806454c7d70] intel_pmu_handle_irq at ffffffff8101dd46
 #7 [ffff8806454c7e80] perf_event_nmi_handler at ffffffff814debd8
 #8 [ffff8806454c7ea0] notifier_call_chain at ffffffff814e0735
 #9 [ffff8806454c7ee0] atomic_notifier_call_chain at ffffffff814e079a
#10 [ffff8806454c7ef0] notify_die at ffffffff8109411e
#11 [ffff8806454c7f20] do_nmi at ffffffff814de383
#12 [ffff8806454c7f50] nmi at ffffffff814ddc90
    RIP: 00000000004083ab  RSP: 00007fffc80115d8  RFLAGS: 00000206
    RAX: 000000007e2bf790  RBX: 0000000001c753f0  RCX: 0000000000008000
    RDX: 0000000000000000  RSI: 0000093b76bfc600  RDI: 000000001277546d
    RBP: 0000000000000200   R8: 00000000fbc80000   R9: 0000000000000000
    R10: 0000000000000064  R11: 0000000000000246  R12: 1277546d7d3d8c69
    R13: 0000000000000000  R14: 0000000000000001  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0033  SS: 002b
--- <NMI exception stack> ---

2. Guest boot hang when libvirt processes many guest-creation requests at the same time. The guest is configured with -smp 1.

Does anyone have any idea about these?

Thanks
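P.S. To make the question in 1a concrete, here is a toy Python model of the rule quoted from the Intel manual together with the BUG_ON(in_nmi()) check. It is only a sketch for discussion, not kernel code: the class NmiModel, its fields, and the helper second_nmi_trips_check are names made up for this illustration. The one assumption it encodes is taken from the quoted sentence itself: the hardware block on further NMIs is lifted by the next IRET, whichever code path executes it. Whether an IRET inside the handler is what actually happened on this host is not established by the dump.

```python
class NmiModel:
    """Toy model of one CPU's NMI state (hypothetical, not kernel code)."""

    def __init__(self):
        # Hardware side: set when an NMI is delivered, cleared by the next IRET.
        self.hw_nmi_blocked = False
        # Software side: stands in for what the kernel's in_nmi() reports.
        self.in_nmi = False

    def deliver_nmi(self):
        """Try to deliver an NMI; return False if hardware holds it pending."""
        if self.hw_nmi_blocked:
            return False
        self.hw_nmi_blocked = True
        if self.in_nmi:
            # Stand-in for BUG_ON(in_nmi()): handler entered while in NMI context.
            raise RuntimeError("BUG: NMI handler entered while in_nmi()")
        self.in_nmi = True
        return True

    def iret(self, leaves_nmi_handler):
        """Any IRET lifts the hardware block, even one that does not end the handler."""
        self.hw_nmi_blocked = False
        if leaves_nmi_handler:
            self.in_nmi = False


def second_nmi_trips_check(iret_inside_handler):
    """Return True if a second NMI reaches the BUG_ON stand-in in this model."""
    cpu = NmiModel()
    cpu.deliver_nmi()  # first NMI: handler starts running
    if iret_inside_handler:
        # Assumed scenario: an IRET runs before the NMI handler has finished.
        cpu.iret(leaves_nmi_handler=False)
    try:
        cpu.deliver_nmi()  # second NMI arrives while the handler is still active
    except RuntimeError:
        return True
    return False
```

In this model, second_nmi_trips_check(False) returns False because the second NMI is simply held pending, while second_nmi_trips_check(True) returns True. So the quoted rule by itself only rules out nesting as long as no IRET executes before the handler returns.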