Avi Kivity wrote:
> Avi Kivity wrote:
>>
>> I suggest checking if you have the latest BIOS update applied. I've
>> had bad experiences with un-updated processors.
>>
>
> FWIW, I have an 8-way F9 guest (2.6.27.5-blah) running on an 2x4
> Barcelona host, happily make -j16ing an allmodconfig kernel.
>

Following the discussion on IRC, I tried -no-kvm-irqchip and again found some virtual machines broken after >1 day of stress testing:

+ sudo -u contain2 env -i qemu-kvm -localtime -kernel virtio-kernel -initrd virtio-initrd -nographic -append 'quiet clocksource=acpi_pm cifsuser=contain2 cifspass=contain2 root=cifs://contain2:contain2@xxxxx 6.2.1/contain2 realroot=//172.16.2.1/users/contain2 ip=172.16.2.2:172.16.2.1::255.255.255.0::eth0:none console=ttyS0 dhcp=off builder=1' -net nic,model=virtio,macaddr=52:54:00:12:34:2 -net tap,ifname=tap2,script=/bin/true -m 2000 -nographic -smp 4 -no-kvm-irqchip /dev/null
qemu: loading initrd (0x1daf359 bytes) at 0x000000007b240000
Stuck ??
Stuck ??
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
IP: [<ffffffff802b539a>] kfree+0x18b/0x26e
PGD 0
Oops: 0000 [1] SMP
last sysfs file:
CPU 2
Modules linked in:
Supported: Yes
Pid: 0, comm: swapper Tainted: G S 2.6.27.7-9-default #1
RIP: 0010:[<ffffffff802b539a>] [<ffffffff802b539a>] kfree+0x18b/0x26e
RSP: 0018:ffff88007a493e90 EFLAGS: 00010046
RAX: 0000000000000002 RBX: ffff8800010397f0 RCX: ffff88007a480778
RDX: ffffe20000000000 RSI: ffff8800010397f0 RDI: ffff88007a5ae140
RBP: 0000000000000000 R08: ffff8800010395d0 R09: ffff88007a493eb8
R10: ffffffff80a59980 R11: ffffffff8021c5d9 R12: 0000000000000001
R13: ffff88007ac04080 R14: 0000000010200042 R15: ffff88007a5ae140
FS: 0000000000000000(0000) GS:ffff88007a461f40(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffff88007a48a000, task ffff88007a488280)
Stack: ffffffff8023df9c ffffffff8073a108 0000000000000286 ffffffff8024a1eb
 ffffffff80259d80 ffff8800010397f0 0000000000000000 0000000000000001
 000000000000000a 0000000010200042 0000000000000010 ffffffff802831d0
Call Trace:
 [<ffffffff802831d0>] __rcu_process_callbacks+0x189/0x203
 [<ffffffff80283271>] rcu_process_callbacks+0x27/0x47
 [<ffffffff802464ed>] __do_softirq+0x84/0x115
 [<ffffffff8020dc9c>] call_softirq+0x1c/0x28
 [<ffffffff8020f067>] do_softirq+0x3c/0x81
 [<ffffffff80246204>] irq_exit+0x3f/0x83
 [<ffffffff8021ce5f>] smp_apic_timer_interrupt+0x95/0xae
 [<ffffffff8020d4a3>] apic_timer_interrupt+0x83/0x90
 [<ffffffff80221f1d>] native_safe_halt+0x2/0x3
 [<ffffffff80213465>] default_idle+0x38/0x54
 [<ffffffff8020b34a>] cpu_idle+0xa9/0xf1
Code: 01 00 00 00 e8 4c fa ff ff 48 83 3d a0 19 44 00 00 49 8b 44 dd 08 48 8d 78 40 75 04 0f 0b eb fe e8 e5 cc f6 ff 90 e9 c7 00 00 00 <8b> 55 00 3b 55 04 73 0f 89 d0 4c 89 7c c5 18 8d 42 01 e9 ad 00
RIP [<ffffffff802b539a>] kfree+0x18b/0x26e
 RSP <ffff88007a493e90>
CR2: 0000000000000000
---[ end trace 4eaa2a86a8e2da22 ]---

After two days of continuous stress testing, I also got the Intel machine w/ current git down:

+ sudo -u contain1 env -i /usr/local/bin/qemu-system-x86_64 -localtime -kernel virtio-kernel -initrd virtio-initrd -nographic -append 'quiet clocksource=acpi_pm cifsuser=contain1 cifspass=contain1 root=cifs://contain1:contain1@xxxxxxxxxx/contain1 realroot=//172.16.1.1/users/contain1 ip=172.16.1.2:172.16.1.1::255.255.255.0::eth0:none console=ttyS0 dhcp=off builder=1' -net nic,model=virtio,macaddr=52:54:00:12:34:1 -net tap,ifname=tap1,script=/bin/true -m 2000 -nographic -smp 8 /dev/null
qemu: loading initrd (0x1daf359 bytes) at 0x000000007b240000
Stuck ??

No backtrace here though. That's all I got from the serial console.

The only issue I have had with the UP guests so far is this:

+ taskset -c 6 sudo -u contain6 env -i qemu-kvm -localtime -kernel virtio-kernel -initrd virtio-initrd -nographic -append 'quiet clocksource=acpi_pm cifsuser=contain6 cifspass=contain6 root=cifs://contain6:contain6@xxxxxxxxxx/contain6 realroot=//172.16.6.1/users/contain6 ip=172.16.6.2:172.16.6.1::255.255.255.0::eth0:none console=ttyS0 dhcp=off builder=1' -net nic,model=virtio,macaddr=52:54:00:12:34:6 -net tap,ifname=tap6,script=/bin/true -m 2000 -nographic /dev/null
qemu: loading initrd (0x1daf359 bytes) at 0x000000007b240000
..MP-BIOS bug: 8254 timer not connected to IO-APIC
Kernel panic - not syncing: IO-APIC + timer doesn't work! Boot with apic=debug and send a report. Then try booting with the 'noapic' option.

which can be annoying at times too. Can't we just detect that the guest is doing its timer detection and give it its interrupts? Or should the PIT reinjection thing help here?

Alex