Hi Avi, On Sun, Feb 13, 2011 at 13:58, Avi Kivity <avi@xxxxxxxxxx> wrote: > On 02/10/2011 05:23 PM, Ruben Kerkhof wrote: >> >> This machine has been running for a week without problems, but then we >> started to get the following oopses again: >> >> 2011-02-06T19:45:35.221555+01:00 phy005 kernel: BUG: unable to handle >> kernel paging request at ffffea71929180e0 >> 2011-02-06T19:45:35.222194+01:00 phy005 kernel: IP: >> [<ffffffff81034880>] gup_pte_range+0x94/0xd3 >> 2011-02-06T19:45:35.222199+01:00 phy005 kernel: PGD 118600067 PUD 0 >> 2011-02-06T19:45:35.222203+01:00 phy005 kernel: Oops: 0000 [#1] SMP >> 2011-02-06T19:45:35.222221+01:00 phy005 kernel: last sysfs file: >> /sys/devices/system/cpu/cpu15/topology/thread_siblings >> 2011-02-06T19:45:35.222224+01:00 phy005 kernel: CPU 4 >> 2011-02-06T19:45:35.222229+01:00 phy005 kernel: Modules linked in: tun >> ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding >> xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter >> ip6_tables ipv6 kvm_intel kvm i2c_i801 i2c_core iTCO_wdt serio_raw igb >> iTCO_vendor_support joydev ioatdma dca 3w_9xxx [last unloaded: >> scsi_wait_scan] >> 2011-02-06T19:45:35.222231+01:00 phy005 kernel: >> 2011-02-06T19:45:35.222233+01:00 phy005 kernel: Pid: 3650, comm: >> qemu-kvm Not tainted 2.6.34.7-66.tilaa.fc13.x86_64 #1 X8DTU/X8DTU >> 2011-02-06T19:45:35.222236+01:00 phy005 kernel: RIP: >> 0010:[<ffffffff81034880>] Â[<ffffffff81034880>] >> gup_pte_range+0x94/0xd3 >> 2011-02-06T19:45:35.222239+01:00 phy005 kernel: RSP: >> 0018:ffff88060b9bda78 ÂEFLAGS: 00010082 >> 2011-02-06T19:45:35.222241+01:00 phy005 kernel: RAX: ffffea71929180e0 >> RBX: 00003ffffffff000 RCX: 0000000000000005 >> 2011-02-06T19:45:35.222243+01:00 phy005 kernel: RDX: 00007fe54e400000 >> RSI: 00007fe54e3ff000 RDI: 1603a07305004067 >> 2011-02-06T19:45:35.222245+01:00 phy005 kernel: RBP: ffff88060b9bda98 >> R08: ffff880b94384560 R09: ffff88060b9bdb44 >> 2011-02-06T19:45:35.222248+01:00 phy005 
kernel: R10: ffff880606b2fff8 >> R11: ffffea0000000000 R12: 0000000000000205 >> 2011-02-06T19:45:35.222251+01:00 phy005 kernel: R13: ffffc00000000fff >> R14: 0000000000000005 R15: 0000000000000000 >> 2011-02-06T19:45:35.222255+01:00 phy005 kernel: FS: >> 00007fe64cb0e700(0000) GS:ffff880655400000(0000) >> knlGS:0000000000000000 >> 2011-02-06T19:45:35.222259+01:00 phy005 kernel: CS: Â0010 DS: 002b ES: >> 002b CR0: 0000000080050033 >> 2011-02-06T19:45:35.222263+01:00 phy005 kernel: CR2: ffffea71929180e0 >> CR3: 0000000bff06d000 CR4: 00000000000026e0 >> 2011-02-06T19:45:35.222267+01:00 phy005 kernel: DR0: 0000000000000000 >> DR1: 0000000000000000 DR2: 0000000000000000 >> 2011-02-06T19:45:35.222271+01:00 phy005 kernel: DR3: 0000000000000000 >> DR6: 00000000ffff0ff0 DR7: 0000000000000400 >> 2011-02-06T19:45:35.222274+01:00 phy005 kernel: Process qemu-kvm (pid: >> 3650, threadinfo ffff88060b9bc000, task ffff880623ed2ee0) >> 2011-02-06T19:45:35.222278+01:00 phy005 kernel: Stack: >> 2011-02-06T19:45:35.222281+01:00 phy005 kernel: 00007fe54e400000 >> 00007fe54e400000 00007fe54e400000 ffff88053a0d2388 >> 2011-02-06T19:45:35.222285+01:00 phy005 kernel:<0> Âffff88060b9bdaf8 >> ffffffff81034a15 00007fe54e3fffff 00007fe54e3fffff >> 2011-02-06T19:45:35.222289+01:00 phy005 kernel:<0> Âffff88060b9bdb44 >> ffff880b94384560 ffff880bff06eca8 ffff880bff06d7f8 >> 2011-02-06T19:45:35.222292+01:00 phy005 kernel: Call Trace: >> 2011-02-06T19:45:35.222296+01:00 phy005 kernel: [<ffffffff81034a15>] >> gup_pud_range+0x156/0x192 >> 2011-02-06T19:45:35.222300+01:00 phy005 kernel: [<ffffffff81034b15>] >> get_user_pages_fast+0xc4/0x172 >> 2011-02-06T19:45:35.222304+01:00 phy005 kernel: [<ffffffff81131fbc>] ? 
>> bio_add_page+0x36/0x38 >> 2011-02-06T19:45:35.222308+01:00 phy005 kernel: [<ffffffff81134730>] >> dio_get_page+0x54/0x127 >> 2011-02-06T19:45:35.222312+01:00 phy005 kernel: [<ffffffff81135317>] >> __blockdev_direct_IO+0x41d/0xa36 >> 2011-02-06T19:45:35.222316+01:00 phy005 kernel: [<ffffffffa0080f69>] ? >> x86_emulate_insn+0x1ff8/0x2d61 [kvm] >> 2011-02-06T19:45:35.222320+01:00 phy005 kernel: [<ffffffff8113379b>] >> blkdev_direct_IO+0x4e/0x50 >> 2011-02-06T19:45:35.222324+01:00 phy005 kernel: [<ffffffff81132c49>] ? >> blkdev_get_blocks+0x0/0x8d >> 2011-02-06T19:45:35.222328+01:00 phy005 kernel: [<ffffffff810cb516>] >> generic_file_direct_write+0xed/0x16d >> 2011-02-06T19:45:35.222331+01:00 phy005 kernel: [<ffffffff810cb72c>] >> __generic_file_aio_write+0x196/0x281 >> 2011-02-06T19:45:35.222335+01:00 phy005 kernel: [<ffffffff811d5352>] ? >> file_has_perm+0xa4/0xc6 >> 2011-02-06T19:45:35.222339+01:00 phy005 kernel: [<ffffffff81133043>] ? >> blkdev_aio_write+0x0/0x69 >> 2011-02-06T19:45:35.222343+01:00 phy005 kernel: [<ffffffff8113306d>] >> blkdev_aio_write+0x2a/0x69 >> 2011-02-06T19:45:35.222347+01:00 phy005 kernel: [<ffffffff81133043>] ? 
>> blkdev_aio_write+0x0/0x69
>> 2011-02-06T19:45:35.222351+01:00 phy005 kernel: [<ffffffff8113d4eb>] aio_rw_vect_retry+0x85/0x18e
>> 2011-02-06T19:45:35.222355+01:00 phy005 kernel: [<ffffffff8113e9b3>] aio_run_iocb+0x77/0x10f
>> 2011-02-06T19:45:35.222359+01:00 phy005 kernel: [<ffffffff8113f508>] do_io_submit+0x558/0x7ce
>> 2011-02-06T19:45:35.222363+01:00 phy005 kernel: [<ffffffff8113f78e>] sys_io_submit+0x10/0x12
>> 2011-02-06T19:45:35.222366+01:00 phy005 kernel: [<ffffffff81009c72>] system_call_fastpath+0x16/0x1b
>> 2011-02-06T19:45:35.222372+01:00 phy005 kernel: Code: 21 d8 49 01 c2 49 8b 3a 49 89 fe 4d 21 ee 4d 21 e6 49 39 ce 75 49 48 89 f8 0f 1f 40 00 48 21 d8 48 c1 e8 0c 48 6b c0 38 4c 01 d8 <66> 83 38 00 48 89 c7 79 04 48 8b 78 10 f0 ff 47 08 49 63 39 48
>> 2011-02-06T19:45:35.222376+01:00 phy005 kernel: RIP [<ffffffff81034880>] gup_pte_range+0x94/0xd3
>> 2011-02-06T19:45:35.222379+01:00 phy005 kernel: RSP <ffff88060b9bda78>
>> 2011-02-06T19:45:35.222382+01:00 phy005 kernel: CR2: ffffea71929180e0
>> 2011-02-06T19:45:35.222386+01:00 phy005 kernel: ---[ end trace beed2b54d0bb8a00 ]---
>>
>
> Hm, outside any kvm code.
>
>> and
>>
>> 2011-02-06T19:47:15.023129+01:00 phy005 kernel: qemu-kvm: Corrupted page table at address 7fbde15ff64c
>> 2011-02-06T19:47:15.023207+01:00 phy005 kernel: PGD 5ff58a067 PUD 612668067 PMD 5937b7067 PE 1603a07305008067
>
> Again outside kvm, and again the magic pte 1603axxxxx.
>
>
>> followed by
>>
>> 2011-02-06T21:20:32.882972+01:00 phy005 kernel: BUG: unable to handle kernel paging request at fffff6b192918010
>> 2011-02-06T21:20:32.883252+01:00 phy005 kernel: IP: [<ffffffffa0078826>] kvm_mmu_zap_page+0x28a/0x299 [kvm]
>
> Well, after something goes bad, nothing good can come out of it.
>
>> after which we rebooted the machine and replaced the motherboard and cpus (we already replaced the memory before).
>> >> But 2 days ago we got this oops: >> >> 2011-02-08T15:56:19.902104+01:00 phy005 kernel: BUG: unable to handle >> kernel paging request at ffffea71929181c0 >> 2011-02-08T15:56:19.902686+01:00 phy005 kernel: IP: >> [<ffffffff81034880>] gup_pte_range+0x94/0xd3 >> 2011-02-08T15:56:19.902693+01:00 phy005 kernel: PGD 118600067 PUD 0 >> 2011-02-08T15:56:19.902699+01:00 phy005 kernel: Oops: 0000 [#1] SMP >> 2011-02-08T15:56:19.902703+01:00 phy005 kernel: last sysfs file: >> /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_m >> ap >> 2011-02-08T15:56:19.902708+01:00 phy005 kernel: CPU 8 >> 2011-02-08T15:56:19.902715+01:00 phy005 kernel: Modules linked in: tun >> ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q >> Âgarp stp llc bonding xt_comment xt_recent ip6t_REJECT >> nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel kvm i >> gb i2c_i801 iTCO_wdt ioatdma i2c_core iTCO_vendor_support dca >> serio_raw joydev 3w_9xxx [last unloaded: scsi_wait_scan] >> 2011-02-08T15:56:19.902770+01:00 phy005 kernel: >> 2011-02-08T15:56:19.902775+01:00 phy005 kernel: Pid: 3346, comm: >> qemu-kvm Not tainted 2.6.34.7-66.tilaa.fc13.x86_64 #1 X >> 8DTU/X8DTU >> 2011-02-08T15:56:19.902781+01:00 phy005 kernel: RIP: >> 0010:[<ffffffff81034880>] Â[<ffffffff81034880>] gup_pte_range+0x94/ >> 0xd3 >> 2011-02-08T15:56:19.902785+01:00 phy005 kernel: RSP: >> 0018:ffff880c21bc1a78 ÂEFLAGS: 00010086 >> 2011-02-08T15:56:19.902789+01:00 phy005 kernel: RAX: ffffea71929181c0 >> RBX: 00003ffffffff000 RCX: 0000000000000005 >> 2011-02-08T15:56:19.902793+01:00 phy005 kernel: RDX: 00007fa2ca200000 >> RSI: 00007fa2ca1ff000 RDI: 1603a07305008067 >> 2011-02-08T15:56:19.902797+01:00 phy005 kernel: RBP: ffff880c21bc1a98 >> R08: ffff88060fdfad60 R09: ffff880c21bc1b44 >> 2011-02-08T15:56:19.902801+01:00 phy005 kernel: R10: ffff88061493fff8 >> R11: ffffea0000000000 R12: 0000000000000205 >> 2011-02-08T15:56:19.902805+01:00 phy005 kernel: R13: ffffc00000000fff >> R14: 0000000000000005 R15: 0000000000000000 
>> 2011-02-08T15:56:19.902810+01:00 phy005 kernel: FS: >> 00007fa2d8724700(0000) GS:ffff880002080000(0000) knlGS:000000000000 >> 0000 >> 2011-02-08T15:56:19.902820+01:00 phy005 kernel: CS: Â0010 DS: 002b ES: >> 002b CR0: 0000000080050033 >> 2011-02-08T15:56:19.902825+01:00 phy005 kernel: CR2: ffffea71929181c0 >> CR3: 0000000c231f9000 CR4: 00000000000026e0 >> 2011-02-08T15:56:19.902829+01:00 phy005 kernel: DR0: 0000000000000000 >> DR1: 0000000000000000 DR2: 0000000000000000 >> 2011-02-08T15:56:19.902833+01:00 phy005 kernel: DR3: 0000000000000000 >> DR6: 00000000ffff0ff0 DR7: 0000000000000400 >> 2011-02-08T15:56:19.902837+01:00 phy005 kernel: Process qemu-kvm (pid: >> 3346, threadinfo ffff880c21bc0000, task ffff880c2 >> 264ddc0) >> 2011-02-08T15:56:19.902841+01:00 phy005 kernel: Stack: >> 2011-02-08T15:56:19.902844+01:00 phy005 kernel: 00007fa2ca200000 >> 00007fa2ca201000 00007fa2ca201000 ffff880c22c3d280 >> 2011-02-08T15:56:19.902848+01:00 phy005 kernel:<0> Âffff880c21bc1af8 >> ffffffff81034a15 00007fa2ca200fff 00007fa2ca200fff >> 2011-02-08T15:56:19.902852+01:00 phy005 kernel:<0> Âffff880c21bc1b44 >> ffff88060fdfad60 ffff880c2231a458 ffff880c231f97f8 >> 2011-02-08T15:56:19.902855+01:00 phy005 kernel: Call Trace: >> 2011-02-08T15:56:19.902859+01:00 phy005 kernel: [<ffffffff81034a15>] >> gup_pud_range+0x156/0x192 >> 2011-02-08T15:56:19.902863+01:00 phy005 kernel: [<ffffffff81034b15>] >> get_user_pages_fast+0xc4/0x172 >> 2011-02-08T15:56:19.902867+01:00 phy005 kernel: [<ffffffff81131fbc>] ? >> bio_add_page+0x36/0x38 >> 2011-02-08T15:56:19.902871+01:00 phy005 kernel: [<ffffffff81134730>] >> dio_get_page+0x54/0x127 >> 2011-02-08T15:56:19.902875+01:00 phy005 kernel: [<ffffffff81135317>] >> __blockdev_direct_IO+0x41d/0xa36 >> 2011-02-08T15:56:19.902880+01:00 phy005 kernel: [<ffffffffa008bf69>] ? 
>> x86_emulate_insn+0x1ff8/0x2d61 [kvm] >> 2011-02-08T15:56:19.902884+01:00 phy005 kernel: [<ffffffff8113379b>] >> blkdev_direct_IO+0x4e/0x50 >> 2011-02-08T15:56:19.902888+01:00 phy005 kernel: [<ffffffff81132c49>] ? >> blkdev_get_blocks+0x0/0x8d >> 2011-02-08T15:56:19.902892+01:00 phy005 kernel: [<ffffffff810cb516>] >> generic_file_direct_write+0xed/0x16d >> 2011-02-08T15:56:19.902896+01:00 phy005 kernel: [<ffffffff810cb72c>] >> __generic_file_aio_write+0x196/0x281 >> 2011-02-08T15:56:19.902899+01:00 phy005 kernel: [<ffffffff81133043>] ? >> blkdev_aio_write+0x0/0x69 >> 2011-02-08T15:56:19.902909+01:00 phy005 kernel: [<ffffffff81133043>] ? >> blkdev_aio_write+0x0/0x69 >> 2011-02-08T15:56:19.902914+01:00 phy005 kernel: [<ffffffff8113d4eb>] >> aio_rw_vect_retry+0x85/0x18e >> 2011-02-08T15:56:19.902919+01:00 phy005 kernel: [<ffffffff8113e9b3>] >> aio_run_iocb+0x77/0x10f >> 2011-02-08T15:56:19.902923+01:00 phy005 kernel: [<ffffffff8113f508>] >> do_io_submit+0x558/0x7ce >> 2011-02-08T15:56:19.902927+01:00 phy005 kernel: [<ffffffff8113f78e>] >> sys_io_submit+0x10/0x12 >> 2011-02-08T15:56:19.902932+01:00 phy005 kernel: [<ffffffff81009c72>] >> system_call_fastpath+0x16/0x1b >> 2011-02-08T15:56:19.902938+01:00 phy005 kernel: Code: 21 d8 49 01 c2 >> 49 8b 3a 49 89 fe 4d 21 ee 4d 21 e6 49 39 ce 75 49 48 89 f8 0f 1f 40 >> 00 48 21 d8 48 c1 e8 0c 48 6b c0 38 4c 01 d8<66> Â83 38 00 48 89 c7 79 >> 04 48 8b 78 10 f0 ff 47 08 49 63 39 48 >> 2011-02-08T15:56:19.903077+01:00 phy005 kernel: RIP >> [<ffffffff81034880>] gup_pte_range+0x94/0xd3 >> 2011-02-08T15:56:19.903081+01:00 phy005 kernel: RSP<ffff880c21bc1a78> >> 2011-02-08T15:56:19.903084+01:00 phy005 kernel: CR2: ffffea71929181c0 >> 2011-02-08T15:56:19.903088+01:00 phy005 kernel: ---[ end trace >> 174c28940e9fd0a7 ]--- >> > > Again outside kvm. 
> >> and yesterday this one: >> >> 2011-02-09T07:40:15.636528+01:00 phy005 kernel: BUG: unable to handle >> kernel NULL pointer dereference at (null) >> 2011-02-09T07:40:15.636635+01:00 phy005 kernel: IP: >> [<ffffffffa0082db8>] gfn_to_rmap+0x20/0x6e [kvm] >> 2011-02-09T07:40:15.636639+01:00 phy005 kernel: PGD 0 >> 2011-02-09T07:40:15.636643+01:00 phy005 kernel: Oops: 0000 [#3] SMP >> 2011-02-09T07:40:15.636647+01:00 phy005 kernel: last sysfs file: >> /sys/devices/system/cpu/cpu15/topology/thread_siblings >> 2011-02-09T07:40:15.636650+01:00 phy005 kernel: CPU 2 >> 2011-02-09T07:40:15.636656+01:00 phy005 kernel: Modules linked in: tun >> ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding >> xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter >> ip6_tables ipv6 kvm_intel kvm igb i2c_i801 iTCO_wdt ioatdma i2c_core >> iTCO_vendor_support dca serio_raw joydev 3w_9xxx [last unloaded: >> scsi_wait_scan] >> 2011-02-09T07:40:15.636663+01:00 phy005 kernel: >> 2011-02-09T07:40:15.636666+01:00 phy005 kernel: Pid: 2572, comm: >> qemu-kvm Tainted: G Â Â ÂD Â Â2.6.34.7-66.tilaa.fc13.x86_64 #1 >> X8DTU/X8DTU >> 2011-02-09T07:40:15.636670+01:00 phy005 kernel: RIP: >> 0010:[<ffffffffa0082db8>] Â[<ffffffffa0082db8>] gfn_to_rmap+0x20/0x6e >> [kvm] >> 2011-02-09T07:40:15.636673+01:00 phy005 kernel: RSP: >> 0018:ffff88061cbcbcd8 ÂEFLAGS: 00010246 >> 2011-02-09T07:40:15.636677+01:00 phy005 kernel: RAX: 0000000000000000 >> RBX: 1603a07305004fff RCX: ffff88061cbcbd08 >> 2011-02-09T07:40:15.636680+01:00 phy005 kernel: RDX: 0000000000000023 >> RSI: 1603a07305004fff RDI: 0000000000000000 >> 2011-02-09T07:40:15.636683+01:00 phy005 kernel: RBP: ffff88061cbcbce8 >> R08: 0000000000000023 R09: 0000000000000000 >> 2011-02-09T07:40:15.636686+01:00 phy005 kernel: R10: 0000000000000000 >> R11: ffffffffa0082c7f R12: 0000000000000001 >> 2011-02-09T07:40:15.636689+01:00 phy005 kernel: R13: 0000000000311763 >> R14: ffff8809b8b01ce0 R15: 0000000000000000 >> 
2011-02-09T07:40:15.636692+01:00 phy005 kernel: FS: >> 0000000000000000(0000) GS:ffff880002040000(0000) >> knlGS:0000000000000000 >> 2011-02-09T07:40:15.636695+01:00 phy005 kernel: CS: Â0010 DS: 0000 ES: >> 0000 CR0: 000000008005003b >> 2011-02-09T07:40:15.636699+01:00 phy005 kernel: CR2: 0000000000000000 >> CR3: 0000000001a42000 CR4: 00000000000026e0 >> 2011-02-09T07:40:15.636702+01:00 phy005 kernel: DR0: 0000000000000000 >> DR1: 0000000000000000 DR2: 0000000000000000 >> 2011-02-09T07:40:15.636705+01:00 phy005 kernel: DR3: 0000000000000000 >> DR6: 00000000ffff0ff0 DR7: 0000000000000400 >> 2011-02-09T07:40:15.636709+01:00 phy005 kernel: Process qemu-kvm (pid: >> 2572, threadinfo ffff88061cbca000, task ffff88061cf04650) >> 2011-02-09T07:40:15.636711+01:00 phy005 kernel: Stack: >> 2011-02-09T07:40:15.636715+01:00 phy005 kernel: ffff88036c471ff8 >> ffff880c23984000 ffff88061cbcbd18 ffffffffa0082ea9 >> 2011-02-09T07:40:15.636718+01:00 phy005 kernel:<0> Âffff8809b8b01ce0 >> ffff880c23984000 ffff88036c471ff8 00000000000001ff >> 2011-02-09T07:40:15.636721+01:00 phy005 kernel:<0> Âffff88061cbcbd58 >> ffffffffa008363b 0000000000000200 ffff880c23984000 >> 2011-02-09T07:40:15.636724+01:00 phy005 kernel: Call Trace: >> 2011-02-09T07:40:15.636728+01:00 phy005 kernel: [<ffffffffa0082ea9>] >> rmap_remove+0xa3/0x1a0 [kvm] >> 2011-02-09T07:40:15.636731+01:00 phy005 kernel: [<ffffffffa008363b>] >> kvm_mmu_zap_page+0x9f/0x299 [kvm] >> 2011-02-09T07:40:15.636734+01:00 phy005 kernel: [<ffffffffa0083a42>] >> kvm_mmu_zap_all+0x35/0x60 [kvm] >> 2011-02-09T07:40:15.636738+01:00 phy005 kernel: [<ffffffffa0078cde>] >> kvm_arch_flush_shadow+0x16/0x22 [kvm] >> 2011-02-09T07:40:15.636741+01:00 phy005 kernel: [<ffffffffa006eb0a>] >> kvm_mmu_notifier_release+0x31/0x44 [kvm] >> 2011-02-09T07:40:15.636744+01:00 phy005 kernel: [<ffffffff810fac37>] >> __mmu_notifier_release+0x4f/0x7b >> 2011-02-09T07:40:15.636748+01:00 phy005 kernel: [<ffffffff810e735d>] >> exit_mmap+0x2c/0x132 >> 
2011-02-09T07:40:15.636751+01:00 phy005 kernel: [<ffffffff8104ad7a>] mmput+0x5e/0xca
>> 2011-02-09T07:40:15.636754+01:00 phy005 kernel: [<ffffffff8104f0d5>] exit_mm+0x114/0x121
>> 2011-02-09T07:40:15.636757+01:00 phy005 kernel: [<ffffffff81050bf5>] do_exit+0x254/0x752
>> 2011-02-09T07:40:15.636760+01:00 phy005 kernel: [<ffffffff8100a60e>] ? apic_timer_interrupt+0xe/0x20
>> 2011-02-09T07:40:15.636764+01:00 phy005 kernel: [<ffffffff81051174>] do_group_exit+0x81/0xab
>> 2011-02-09T07:40:15.636767+01:00 phy005 kernel: [<ffffffff810511b5>] sys_exit_group+0x17/0x1b
>> 2011-02-09T07:40:15.636771+01:00 phy005 kernel: [<ffffffff81009c72>] system_call_fastpath+0x16/0x1b
>> 2011-02-09T07:40:15.636777+01:00 phy005 kernel: Code: 88 ff ff ff b8 01 00 00 00 c9 c3 55 48 89 e5 41 54 53 0f 1f 44 00 00 41 89 d4 48 89 f3 e8 7b c7 fe ff 41 83 fc 01 48 89 c7 75 0d <48> 2b 18 48 c1 e3 03 48 03 58 18 eb 39 41 8d 4c 24 ff be 01 00
>> 2011-02-09T07:40:15.636785+01:00 phy005 kernel: RIP [<ffffffffa0082db8>] gfn_to_rmap+0x20/0x6e [kvm]
>> 2011-02-09T07:40:15.636788+01:00 phy005 kernel: RSP <ffff88061cbcbcd8>
>> 2011-02-09T07:40:15.636791+01:00 phy005 kernel: CR2: 0000000000000000
>> 2011-02-09T07:40:15.637743+01:00 phy005 kernel: ---[ end trace 174c28940e9fd0a9 ]---
>> 2011-02-09T07:40:15.637751+01:00 phy005 kernel: Fixing recursive fault but reboot is needed!
>>
>
> In kvm. Was there a reboot between the two?

No, there wasn't.
I've just looked back at the logs and there was another oops in between: 2011-02-09T04:28:01.890999+01:00 phy005 kernel: general protection fault: 0000 [#2] SMP 2011-02-09T04:28:01.891122+01:00 phy005 kernel: last sysfs file: /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_m ap 2011-02-09T04:28:01.891127+01:00 phy005 kernel: CPU 12 2011-02-09T04:28:01.891137+01:00 phy005 kernel: Modules linked in: tun ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel kvm i gb i2c_i801 iTCO_wdt ioatdma i2c_core iTCO_vendor_support dca serio_raw joydev 3w_9xxx [last unloaded: scsi_wait_scan] 2011-02-09T04:28:01.891144+01:00 phy005 kernel: 2011-02-09T04:28:01.891148+01:00 phy005 kernel: Pid: 19782, comm: find Tainted: G D 2.6.34.7-66.tilaa.fc13.x86_6 4 #1 X8DTU/X8DTU 2011-02-09T04:28:01.891154+01:00 phy005 kernel: RIP: 0010:[<ffffffff81158aa4>] [<ffffffff81158aa4>] proc_fd_instantiate +0x88/0x127 2011-02-09T04:28:01.891157+01:00 phy005 kernel: RSP: 0018:ffff880245677da8 EFLAGS: 00010206 2011-02-09T04:28:01.891161+01:00 phy005 kernel: RAX: 1603a07305000000 RBX: ffff8808076ada40 RCX: ffff88058bbbddc0 2011-02-09T04:28:01.891164+01:00 phy005 kernel: RDX: 000000000000022a RSI: ffff8808076ada40 RDI: ffff88062293ee80 2011-02-09T04:28:01.891168+01:00 phy005 kernel: RBP: ffff880245677dc8 R08: ffff8808076a91d0 R09: ffffffff81158a1c 2011-02-09T04:28:01.891172+01:00 phy005 kernel: R10: 0000000000000002 R11: ffff880245677d08 R12: ffff88062293ee00 2011-02-09T04:28:01.891176+01:00 phy005 kernel: R13: ffff8805b3897bf8 R14: ffff8808076a9430 R15: ffff8807ddd76c00 2011-02-09T04:28:01.891180+01:00 phy005 kernel: FS: 00007f09aa8e07a0(0000) GS:ffff880655480000(0000) knlGS:000000000000 0000 2011-02-09T04:28:01.891184+01:00 phy005 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 2011-02-09T04:28:01.891188+01:00 phy005 kernel: CR2: 0000000000e43080 CR3: 00000007d6d6c000 CR4: 
00000000000026e0 2011-02-09T04:28:01.891192+01:00 phy005 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2011-02-09T04:28:01.891196+01:00 phy005 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2011-02-09T04:28:01.891199+01:00 phy005 kernel: Process find (pid: 19782, threadinfo ffff880245676000, task ffff88058bbb 8000) 2011-02-09T04:28:01.891202+01:00 phy005 kernel: Stack: 2011-02-09T04:28:01.891206+01:00 phy005 kernel: ffff880245677e78 0000000000000003 ffff8802bfe0af00 ffff8808076ada40 2011-02-09T04:28:01.891209+01:00 phy005 kernel: <0> ffff880245677e38 ffffffff811564b8 ffff880245677e38 ffffffff81158a1c 2011-02-09T04:28:01.891213+01:00 phy005 kernel: <0> ffffffff8111b530 ffff880245677f38 0000000300119d45 ffff880245677e78 2011-02-09T04:28:01.891216+01:00 phy005 kernel: Call Trace: 2011-02-09T04:28:01.891220+01:00 phy005 kernel: [<ffffffff811564b8>] proc_fill_cache+0xa7/0x13f 2011-02-09T04:28:01.891224+01:00 phy005 kernel: [<ffffffff81158a1c>] ? proc_fd_instantiate+0x0/0x127 2011-02-09T04:28:01.891227+01:00 phy005 kernel: [<ffffffff8111b530>] ? filldir+0x0/0xd0 2011-02-09T04:28:01.891231+01:00 phy005 kernel: [<ffffffff8111b530>] ? filldir+0x0/0xd0 2011-02-09T04:28:01.891235+01:00 phy005 kernel: [<ffffffff811586c8>] proc_readfd_common+0x159/0x1a3 2011-02-09T04:28:01.891239+01:00 phy005 kernel: [<ffffffff81158a1c>] ? proc_fd_instantiate+0x0/0x127 2011-02-09T04:28:01.891242+01:00 phy005 kernel: [<ffffffff8111b530>] ? 
filldir+0x0/0xd0
2011-02-09T04:28:01.891246+01:00 phy005 kernel: [<ffffffff8115873e>] proc_readfd+0x15/0x17
2011-02-09T04:28:01.891250+01:00 phy005 kernel: [<ffffffff8111b731>] vfs_readdir+0x77/0xb4
2011-02-09T04:28:01.891254+01:00 phy005 kernel: [<ffffffff8111b8b7>] sys_getdents+0x81/0xd1
2011-02-09T04:28:01.891258+01:00 phy005 kernel: [<ffffffff81009c72>] system_call_fastpath+0x16/0x1b
2011-02-09T04:28:01.891263+01:00 phy005 kernel: Code: e8 08 3e 2f 00 49 8b 44 24 08 44 3b 28 0f 83 9c 00 00 00 45 89 ed 49 c1 e5 03 4c 03 68 08 49 8b 45 00 48 85 c0 0f 84 84 00 00 00 <f6> 40 3c 01 74 0a 66 41 81 8e aa 00 00 00 40 01 f6 40 3c 02 74
2011-02-09T04:28:01.891275+01:00 phy005 kernel: RIP [<ffffffff81158aa4>] proc_fd_instantiate+0x88/0x127
2011-02-09T04:28:01.891279+01:00 phy005 kernel: RSP <ffff880245677da8>
2011-02-09T04:28:01.891283+01:00 phy005 kernel: ---[ end trace 174c28940e9fd0a8 ]---

>
>> So it doesn't seem to be a hardware problem since we replaced all that.
>
> I agree. And your other machines are stable?

Yes, the other ones have been running for ages without problems.
We've been using 2.6.34.7 for about three months now.

> When you say "identical software", are those exactly the same binaries?

Yes, the same (kickstarted) install, the same rpms.

> copying Andrea for possible insight into the non-kvm oopses.
>
> --
> error compiling committee.c: too many arguments to function

Kind regards,

Ruben
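As a sanity check on the "magic pte" observation, here is a small sketch showing that the faulting addresses in the two gup_pte_range oopses follow directly from the bogus pte values in RDI. It assumes the Code: bytes just before the faulting instruction decode as and rax,rbx / shr rax,12 / imul rax,rax,0x38 / add rax,r11, that RBX holds the pfn mask, and that R11 holds the base of the struct page array. The constant and function names are made up for illustration; all values are copied from the register dumps above.

```python
# Values copied from the two gup_pte_range oopses quoted in this mail.
VMEMMAP_BASE = 0xFFFFEA0000000000   # R11 in both oopses (assumed: base of the struct page array)
PFN_MASK = 0x00003FFFFFFFF000       # RBX in both oopses (assumed: pte physical-address mask)
STRUCT_PAGE_SIZE = 0x38             # immediate of the imul in the Code: bytes (assumed decoding)


def page_struct_address(pte: int) -> int:
    """Mirror the and/shr/imul/add sequence: mask the pte, shift to a pfn, index the page array."""
    pfn = (pte & PFN_MASK) >> 12
    return (VMEMMAP_BASE + pfn * STRUCT_PAGE_SIZE) & 0xFFFFFFFFFFFFFFFF


# (RDI = pte value read from the page table, CR2 = faulting address)
OOPSES = [
    (0x1603A07305004067, 0xFFFFEA71929180E0),  # 2011-02-06 oops
    (0x1603A07305008067, 0xFFFFEA71929181C0),  # 2011-02-08 oops
]

for pte, cr2 in OOPSES:
    addr = page_struct_address(pte)
    print(f"pte={pte:#018x} -> struct page at {addr:#018x}, matches CR2: {addr == cr2}")
```

Both lines report a match, so under these assumptions the paging faults are a consequence of the corrupted pte rather than a separate problem: the pte points at a pfn for which no struct page is mapped, which fits the "PUD 0" in both oopses.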