Marcelo Tosatti wrote:
> Hi Alexander,
>
> On Thu, Jan 22, 2009 at 09:29:46PM +0100, Alexander Graf wrote:
>
>> Following the discussion on IRC, I tried -no-kvm-irqchip and found some
>> virtual machines broken after >1 day of stress testing again:
>>
>> + sudo -u contain2 env -i qemu-kvm -localtime -kernel virtio-kernel
>> -initrd virtio-initrd -nographic -append 'quiet clocksource=acpi_pm
>> cifsuser=contain2 cifspass=contain2 root=cifs://contain2:contain2@xxxxx
>> 6.2.1/contain2 realroot=//172.16.2.1/users/contain2
>> ip=172.16.2.2:172.16.2.1::255.255.255.0::eth0:none console=ttyS0
>> dhcp=off builder=1' -net nic,model=virtio,macaddr=52:54:00:12:34:2 -net
>> tap,ifname=tap2,script=/bin/true -m 2000 -nographic -smp 4 -no-kvm-irqchip /dev/null
>> qemu: loading initrd (0x1daf359 bytes) at 0x000000007b240000
>> Stuck ??
>> Stuck ??
>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
>> IP: [<ffffffff802b539a>] kfree+0x18b/0x26e
>> PGD 0
>> Oops: 0000 [1] SMP
>> last sysfs file:
>> CPU 2
>> Modules linked in:
>> Supported: Yes
>> Pid: 0, comm: swapper Tainted: G S 2.6.27.7-9-default #1
>> RIP: 0010:[<ffffffff802b539a>] [<ffffffff802b539a>] kfree+0x18b/0x26e
>> RSP: 0018:ffff88007a493e90 EFLAGS: 00010046
>> RAX: 0000000000000002 RBX: ffff8800010397f0 RCX: ffff88007a480778
>> RDX: ffffe20000000000 RSI: ffff8800010397f0 RDI: ffff88007a5ae140
>> RBP: 0000000000000000 R08: ffff8800010395d0 R09: ffff88007a493eb8
>> R10: ffffffff80a59980 R11: ffffffff8021c5d9 R12: 0000000000000001
>> R13: ffff88007ac04080 R14: 0000000010200042 R15: ffff88007a5ae140
>> FS: 0000000000000000(0000) GS:ffff88007a461f40(0000) knlGS:0000000000000000
>> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
>> CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006e0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Process swapper (pid: 0, threadinfo ffff88007a48a000, task ffff88007a488280)
>> Stack: ffffffff8023df9c ffffffff8073a108 0000000000000286 ffffffff8024a1eb
>> ffffffff80259d80 ffff8800010397f0 0000000000000000 0000000000000001
>> 000000000000000a 0000000010200042 0000000000000010 ffffffff802831d0
>> Call Trace:
>> [<ffffffff802831d0>] __rcu_process_callbacks+0x189/0x203
>> [<ffffffff80283271>] rcu_process_callbacks+0x27/0x47
>> [<ffffffff802464ed>] __do_softirq+0x84/0x115
>> [<ffffffff8020dc9c>] call_softirq+0x1c/0x28
>> [<ffffffff8020f067>] do_softirq+0x3c/0x81
>> [<ffffffff80246204>] irq_exit+0x3f/0x83
>> [<ffffffff8021ce5f>] smp_apic_timer_interrupt+0x95/0xae
>> [<ffffffff8020d4a3>] apic_timer_interrupt+0x83/0x90
>> [<ffffffff80221f1d>] native_safe_halt+0x2/0x3
>> [<ffffffff80213465>] default_idle+0x38/0x54
>> [<ffffffff8020b34a>] cpu_idle+0xa9/0xf1
>>
>>
>> Code: 01 00 00 00 e8 4c fa ff ff 48 83 3d a0 19 44 00 00 49 8b 44 dd 08
>> 48 8d 78 40 75 04 0f 0b eb fe e8 e5 cc f6 ff 90 e9 c7 00 00 00 <8b> 55
>> 00 3b 55 04 73 0f 89 d0 4c 89 7c c5 18 8d 42 01 e9 ad 00
>> RIP [<ffffffff802b539a>] kfree+0x18b/0x26e
>> RSP <ffff88007a493e90>
>> CR2: 0000000000000000
>> ---[ end trace 4eaa2a86a8e2da22 ]---
>>
>>
>> Also after two days of permanent stress testing I also got the Intel
>> machine w/ current git down:
>>
>> + sudo -u contain1 env -i /usr/local/bin/qemu-system-x86_64 -localtime
>> -kernel virtio-kernel -initrd virtio-initrd -nographic -append 'quiet
>> clocksource=acpi_pm cifsuser=contain1 cifspass=contain1
>> root=cifs://contain1:contain1@xxxxxxxxxx/contain1
>> realroot=//172.16.1.1/users/contain1
>> ip=172.16.1.2:172.16.1.1::255.255.255.0::eth0:none console=ttyS0
>> dhcp=off builder=1' -net nic,model=virtio,macaddr=52:54:00:12:34:1 -net
>> tap,ifname=tap1,script=/bin/true -m 2000 -nographic -smp 8 /dev/null
>> qemu: loading initrd (0x1daf359 bytes) at 0x000000007b240000
>> Stuck ??
>>
>> No backtrace here though. That's all I got from the serial console.
>>
>> The only issues I had with the UP guests so far was this:
>>
>> + taskset -c 6 sudo -u contain6 env -i qemu-kvm -localtime -kernel
>> virtio-kernel -initrd virtio-initrd -nographic -append 'quiet
>> clocksource=acpi_pm cifsuser=contain6 cifspass=contain6
>> root=cifs://contain6:contain6@xxxxxxxxxx/contain6
>> realroot=//172.16.6.1/users/contain6
>> ip=172.16.6.2:172.16.6.1::255.255.255.0::eth0:none console=ttyS0
>> dhcp=off builder=1' -net nic,model=virtio,macaddr=52:54:00:12:34:6 -net
>> tap,ifname=tap6,script=/bin/true -m 2000 -nographic /dev/null
>> qemu: loading initrd (0x1daf359 bytes) at 0x000000007b240000
>> ..MP-BIOS bug: 8254 timer not connected to IO-APIC
>> Kernel panic - not syncing: IO-APIC + timer doesn't work! Boot with
>> apic=debug and send a report. Then try booting with the 'noapic' option.
>>
>> which can be annoying at times too. Can't we just detect that it's the
>> detection and give the guest its interrupts? Or should the PIT
>> reinjection thing help here?
>>
>
> There are a number of problems that can result in this error, and the
> problems are possibly different between the in-kernel PIT and userspace
> PIT emulation (note it also happens with in-kernel PIT, just much more
> rarely now). You can use the no_timer_check kernel option to bypass it.
>

Hm - that option disables the whole check, making it always fail. I
haven't seen any way to actually disable the check, telling Linux things
are OK :-(.

Alex
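
Since these failures only show up on the serial console after a day or more of stress testing, sorting saved console logs by failure signature may help when many guests run in parallel. The following is a minimal sketch, assuming each guest's console output has been saved to a file. The function name and the labels are made up for illustration; only the matched strings are taken from the console output quoted above.

```python
# Hypothetical helper: names and labels are illustrative only.
# The needle strings are copied from the console output quoted in this thread.
SIGNATURES = [
    ("ioapic_timer_panic", "IO-APIC + timer doesn't work!"),
    ("null_deref_oops", "BUG: unable to handle kernel NULL pointer dereference"),
    ("stuck", "Stuck ??"),
]


def classify_console(log_text):
    """Return the labels of all known failure signatures found in log_text.

    Labels are returned in the order they appear in SIGNATURES, not in the
    order they occur in the log.
    """
    return [label for label, needle in SIGNATURES if needle in log_text]


if __name__ == "__main__":
    import sys

    # Usage (assumed layout): one saved console log per guest, passed as arguments.
    for path in sys.argv[1:]:
        with open(path, errors="replace") as f:
            labels = classify_console(f.read())
        print(path, ", ".join(labels) if labels else "no known signature")
```

A log that contains both "Stuck ??" and the oops gets both labels, which keeps the SMP hang with a backtrace apart from the one without.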