Re: EPT: Misconfiguration

Hi Avi,

On Sun, Feb 13, 2011 at 13:58, Avi Kivity <avi@xxxxxxxxxx> wrote:
> On 02/10/2011 05:23 PM, Ruben Kerkhof wrote:
>>
>> This machine has been running for a week without problems, but then we
>> started to get the following oopses again:
>>
>> 2011-02-06T19:45:35.221555+01:00 phy005 kernel: BUG: unable to handle
>> kernel paging request at ffffea71929180e0
>> 2011-02-06T19:45:35.222194+01:00 phy005 kernel: IP:
>> [<ffffffff81034880>] gup_pte_range+0x94/0xd3
>> 2011-02-06T19:45:35.222199+01:00 phy005 kernel: PGD 118600067 PUD 0
>> 2011-02-06T19:45:35.222203+01:00 phy005 kernel: Oops: 0000 [#1] SMP
>> 2011-02-06T19:45:35.222221+01:00 phy005 kernel: last sysfs file:
>> /sys/devices/system/cpu/cpu15/topology/thread_siblings
>> 2011-02-06T19:45:35.222224+01:00 phy005 kernel: CPU 4
>> 2011-02-06T19:45:35.222229+01:00 phy005 kernel: Modules linked in: tun
>> ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding
>> xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter
>> ip6_tables ipv6 kvm_intel kvm i2c_i801 i2c_core iTCO_wdt serio_raw igb
>> iTCO_vendor_support joydev ioatdma dca 3w_9xxx [last unloaded:
>> scsi_wait_scan]
>> 2011-02-06T19:45:35.222231+01:00 phy005 kernel:
>> 2011-02-06T19:45:35.222233+01:00 phy005 kernel: Pid: 3650, comm:
>> qemu-kvm Not tainted 2.6.34.7-66.tilaa.fc13.x86_64 #1 X8DTU/X8DTU
>> 2011-02-06T19:45:35.222236+01:00 phy005 kernel: RIP:
>> 0010:[<ffffffff81034880>]  [<ffffffff81034880>]
>> gup_pte_range+0x94/0xd3
>> 2011-02-06T19:45:35.222239+01:00 phy005 kernel: RSP:
>> 0018:ffff88060b9bda78  EFLAGS: 00010082
>> 2011-02-06T19:45:35.222241+01:00 phy005 kernel: RAX: ffffea71929180e0
>> RBX: 00003ffffffff000 RCX: 0000000000000005
>> 2011-02-06T19:45:35.222243+01:00 phy005 kernel: RDX: 00007fe54e400000
>> RSI: 00007fe54e3ff000 RDI: 1603a07305004067
>> 2011-02-06T19:45:35.222245+01:00 phy005 kernel: RBP: ffff88060b9bda98
>> R08: ffff880b94384560 R09: ffff88060b9bdb44
>> 2011-02-06T19:45:35.222248+01:00 phy005 kernel: R10: ffff880606b2fff8
>> R11: ffffea0000000000 R12: 0000000000000205
>> 2011-02-06T19:45:35.222251+01:00 phy005 kernel: R13: ffffc00000000fff
>> R14: 0000000000000005 R15: 0000000000000000
>> 2011-02-06T19:45:35.222255+01:00 phy005 kernel: FS:
>> 00007fe64cb0e700(0000) GS:ffff880655400000(0000)
>> knlGS:0000000000000000
>> 2011-02-06T19:45:35.222259+01:00 phy005 kernel: CS:  0010 DS: 002b ES:
>> 002b CR0: 0000000080050033
>> 2011-02-06T19:45:35.222263+01:00 phy005 kernel: CR2: ffffea71929180e0
>> CR3: 0000000bff06d000 CR4: 00000000000026e0
>> 2011-02-06T19:45:35.222267+01:00 phy005 kernel: DR0: 0000000000000000
>> DR1: 0000000000000000 DR2: 0000000000000000
>> 2011-02-06T19:45:35.222271+01:00 phy005 kernel: DR3: 0000000000000000
>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> 2011-02-06T19:45:35.222274+01:00 phy005 kernel: Process qemu-kvm (pid:
>> 3650, threadinfo ffff88060b9bc000, task ffff880623ed2ee0)
>> 2011-02-06T19:45:35.222278+01:00 phy005 kernel: Stack:
>> 2011-02-06T19:45:35.222281+01:00 phy005 kernel: 00007fe54e400000
>> 00007fe54e400000 00007fe54e400000 ffff88053a0d2388
>> 2011-02-06T19:45:35.222285+01:00 phy005 kernel:<0>  ffff88060b9bdaf8
>> ffffffff81034a15 00007fe54e3fffff 00007fe54e3fffff
>> 2011-02-06T19:45:35.222289+01:00 phy005 kernel:<0>  ffff88060b9bdb44
>> ffff880b94384560 ffff880bff06eca8 ffff880bff06d7f8
>> 2011-02-06T19:45:35.222292+01:00 phy005 kernel: Call Trace:
>> 2011-02-06T19:45:35.222296+01:00 phy005 kernel: [<ffffffff81034a15>]
>> gup_pud_range+0x156/0x192
>> 2011-02-06T19:45:35.222300+01:00 phy005 kernel: [<ffffffff81034b15>]
>> get_user_pages_fast+0xc4/0x172
>> 2011-02-06T19:45:35.222304+01:00 phy005 kernel: [<ffffffff81131fbc>] ?
>> bio_add_page+0x36/0x38
>> 2011-02-06T19:45:35.222308+01:00 phy005 kernel: [<ffffffff81134730>]
>> dio_get_page+0x54/0x127
>> 2011-02-06T19:45:35.222312+01:00 phy005 kernel: [<ffffffff81135317>]
>> __blockdev_direct_IO+0x41d/0xa36
>> 2011-02-06T19:45:35.222316+01:00 phy005 kernel: [<ffffffffa0080f69>] ?
>> x86_emulate_insn+0x1ff8/0x2d61 [kvm]
>> 2011-02-06T19:45:35.222320+01:00 phy005 kernel: [<ffffffff8113379b>]
>> blkdev_direct_IO+0x4e/0x50
>> 2011-02-06T19:45:35.222324+01:00 phy005 kernel: [<ffffffff81132c49>] ?
>> blkdev_get_blocks+0x0/0x8d
>> 2011-02-06T19:45:35.222328+01:00 phy005 kernel: [<ffffffff810cb516>]
>> generic_file_direct_write+0xed/0x16d
>> 2011-02-06T19:45:35.222331+01:00 phy005 kernel: [<ffffffff810cb72c>]
>> __generic_file_aio_write+0x196/0x281
>> 2011-02-06T19:45:35.222335+01:00 phy005 kernel: [<ffffffff811d5352>] ?
>> file_has_perm+0xa4/0xc6
>> 2011-02-06T19:45:35.222339+01:00 phy005 kernel: [<ffffffff81133043>] ?
>> blkdev_aio_write+0x0/0x69
>> 2011-02-06T19:45:35.222343+01:00 phy005 kernel: [<ffffffff8113306d>]
>> blkdev_aio_write+0x2a/0x69
>> 2011-02-06T19:45:35.222347+01:00 phy005 kernel: [<ffffffff81133043>] ?
>> blkdev_aio_write+0x0/0x69
>> 2011-02-06T19:45:35.222351+01:00 phy005 kernel: [<ffffffff8113d4eb>]
>> aio_rw_vect_retry+0x85/0x18e
>> 2011-02-06T19:45:35.222355+01:00 phy005 kernel: [<ffffffff8113e9b3>]
>> aio_run_iocb+0x77/0x10f
>> 2011-02-06T19:45:35.222359+01:00 phy005 kernel: [<ffffffff8113f508>]
>> do_io_submit+0x558/0x7ce
>> 2011-02-06T19:45:35.222363+01:00 phy005 kernel: [<ffffffff8113f78e>]
>> sys_io_submit+0x10/0x12
>> 2011-02-06T19:45:35.222366+01:00 phy005 kernel: [<ffffffff81009c72>]
>> system_call_fastpath+0x16/0x1b
>> 2011-02-06T19:45:35.222372+01:00 phy005 kernel: Code: 21 d8 49 01 c2
>> 49 8b 3a 49 89 fe 4d 21 ee 4d 21 e6 49 39 ce 75 49 48 89 f8 0f 1f 40
>> 00 48 21 d8 48 c1 e8 0c 48 6b c0 38 4c 01 d8<66>  83 38 00 48 89 c7 79
>> 04 48 8b 78 10 f0 ff 47 08 49 63 39 48
>> 2011-02-06T19:45:35.222376+01:00 phy005 kernel: RIP
>> [<ffffffff81034880>] gup_pte_range+0x94/0xd3
>> 2011-02-06T19:45:35.222379+01:00 phy005 kernel: RSP<ffff88060b9bda78>
>> 2011-02-06T19:45:35.222382+01:00 phy005 kernel: CR2: ffffea71929180e0
>> 2011-02-06T19:45:35.222386+01:00 phy005 kernel: ---[ end trace
>> beed2b54d0bb8a00 ]---
>>
>
> Hm, outside any kvm code.
>
>> and
>>
>> 2011-02-06T19:47:15.023129+01:00 phy005 kernel: qemu-kvm: Corrupted
>> page table at address 7fbde15ff64c
>> 2011-02-06T19:47:15.023207+01:00 phy005 kernel: PGD 5ff58a067 PUD
>> 612668067 PMD 5937b7067 PE 1603a07305008067
>
> Again outside kvm, and again the magic pte 1603axxxxx.
>
>
>> followed by
>>
>> 2011-02-06T21:20:32.882972+01:00 phy005 kernel: BUG: unable to handle
>> kernel paging request at fffff6b192918010
>> 2011-02-06T21:20:32.883252+01:00 phy005 kernel: IP:
>> [<ffffffffa0078826>] kvm_mmu_zap_page+0x28a/0x299 [kvm]
>
> Well, after something goes bad, nothing good can come out of it.
>
>> after which we rebooted the machine and replaced the motherboard and
>> cpus (we already replaced the memory before).
>>
>> But 2 days ago we got this oops:
>>
>> 2011-02-08T15:56:19.902104+01:00 phy005 kernel: BUG: unable to handle
>> kernel paging request at ffffea71929181c0
>> 2011-02-08T15:56:19.902686+01:00 phy005 kernel: IP:
>> [<ffffffff81034880>] gup_pte_range+0x94/0xd3
>> 2011-02-08T15:56:19.902693+01:00 phy005 kernel: PGD 118600067 PUD 0
>> 2011-02-08T15:56:19.902699+01:00 phy005 kernel: Oops: 0000 [#1] SMP
>> 2011-02-08T15:56:19.902703+01:00 phy005 kernel: last sysfs file:
>> /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_map
>> 2011-02-08T15:56:19.902708+01:00 phy005 kernel: CPU 8
>> 2011-02-08T15:56:19.902715+01:00 phy005 kernel: Modules linked in: tun
>> ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q
>> garp stp llc bonding xt_comment xt_recent ip6t_REJECT
>> nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel kvm
>> igb i2c_i801 iTCO_wdt ioatdma i2c_core iTCO_vendor_support dca
>> serio_raw joydev 3w_9xxx [last unloaded: scsi_wait_scan]
>> 2011-02-08T15:56:19.902770+01:00 phy005 kernel:
>> 2011-02-08T15:56:19.902775+01:00 phy005 kernel: Pid: 3346, comm:
>> qemu-kvm Not tainted 2.6.34.7-66.tilaa.fc13.x86_64 #1 X8DTU/X8DTU
>> 2011-02-08T15:56:19.902781+01:00 phy005 kernel: RIP:
>> 0010:[<ffffffff81034880>]  [<ffffffff81034880>] gup_pte_range+0x94/0xd3
>> 2011-02-08T15:56:19.902785+01:00 phy005 kernel: RSP:
>> 0018:ffff880c21bc1a78  EFLAGS: 00010086
>> 2011-02-08T15:56:19.902789+01:00 phy005 kernel: RAX: ffffea71929181c0
>> RBX: 00003ffffffff000 RCX: 0000000000000005
>> 2011-02-08T15:56:19.902793+01:00 phy005 kernel: RDX: 00007fa2ca200000
>> RSI: 00007fa2ca1ff000 RDI: 1603a07305008067
>> 2011-02-08T15:56:19.902797+01:00 phy005 kernel: RBP: ffff880c21bc1a98
>> R08: ffff88060fdfad60 R09: ffff880c21bc1b44
>> 2011-02-08T15:56:19.902801+01:00 phy005 kernel: R10: ffff88061493fff8
>> R11: ffffea0000000000 R12: 0000000000000205
>> 2011-02-08T15:56:19.902805+01:00 phy005 kernel: R13: ffffc00000000fff
>> R14: 0000000000000005 R15: 0000000000000000
>> 2011-02-08T15:56:19.902810+01:00 phy005 kernel: FS:
>> 00007fa2d8724700(0000) GS:ffff880002080000(0000) knlGS:0000000000000000
>> 2011-02-08T15:56:19.902820+01:00 phy005 kernel: CS:  0010 DS: 002b ES:
>> 002b CR0: 0000000080050033
>> 2011-02-08T15:56:19.902825+01:00 phy005 kernel: CR2: ffffea71929181c0
>> CR3: 0000000c231f9000 CR4: 00000000000026e0
>> 2011-02-08T15:56:19.902829+01:00 phy005 kernel: DR0: 0000000000000000
>> DR1: 0000000000000000 DR2: 0000000000000000
>> 2011-02-08T15:56:19.902833+01:00 phy005 kernel: DR3: 0000000000000000
>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> 2011-02-08T15:56:19.902837+01:00 phy005 kernel: Process qemu-kvm (pid:
>> 3346, threadinfo ffff880c21bc0000, task ffff880c2264ddc0)
>> 2011-02-08T15:56:19.902841+01:00 phy005 kernel: Stack:
>> 2011-02-08T15:56:19.902844+01:00 phy005 kernel: 00007fa2ca200000
>> 00007fa2ca201000 00007fa2ca201000 ffff880c22c3d280
>> 2011-02-08T15:56:19.902848+01:00 phy005 kernel:<0>  ffff880c21bc1af8
>> ffffffff81034a15 00007fa2ca200fff 00007fa2ca200fff
>> 2011-02-08T15:56:19.902852+01:00 phy005 kernel:<0>  ffff880c21bc1b44
>> ffff88060fdfad60 ffff880c2231a458 ffff880c231f97f8
>> 2011-02-08T15:56:19.902855+01:00 phy005 kernel: Call Trace:
>> 2011-02-08T15:56:19.902859+01:00 phy005 kernel: [<ffffffff81034a15>]
>> gup_pud_range+0x156/0x192
>> 2011-02-08T15:56:19.902863+01:00 phy005 kernel: [<ffffffff81034b15>]
>> get_user_pages_fast+0xc4/0x172
>> 2011-02-08T15:56:19.902867+01:00 phy005 kernel: [<ffffffff81131fbc>] ?
>> bio_add_page+0x36/0x38
>> 2011-02-08T15:56:19.902871+01:00 phy005 kernel: [<ffffffff81134730>]
>> dio_get_page+0x54/0x127
>> 2011-02-08T15:56:19.902875+01:00 phy005 kernel: [<ffffffff81135317>]
>> __blockdev_direct_IO+0x41d/0xa36
>> 2011-02-08T15:56:19.902880+01:00 phy005 kernel: [<ffffffffa008bf69>] ?
>> x86_emulate_insn+0x1ff8/0x2d61 [kvm]
>> 2011-02-08T15:56:19.902884+01:00 phy005 kernel: [<ffffffff8113379b>]
>> blkdev_direct_IO+0x4e/0x50
>> 2011-02-08T15:56:19.902888+01:00 phy005 kernel: [<ffffffff81132c49>] ?
>> blkdev_get_blocks+0x0/0x8d
>> 2011-02-08T15:56:19.902892+01:00 phy005 kernel: [<ffffffff810cb516>]
>> generic_file_direct_write+0xed/0x16d
>> 2011-02-08T15:56:19.902896+01:00 phy005 kernel: [<ffffffff810cb72c>]
>> __generic_file_aio_write+0x196/0x281
>> 2011-02-08T15:56:19.902899+01:00 phy005 kernel: [<ffffffff81133043>] ?
>> blkdev_aio_write+0x0/0x69
>> 2011-02-08T15:56:19.902909+01:00 phy005 kernel: [<ffffffff81133043>] ?
>> blkdev_aio_write+0x0/0x69
>> 2011-02-08T15:56:19.902914+01:00 phy005 kernel: [<ffffffff8113d4eb>]
>> aio_rw_vect_retry+0x85/0x18e
>> 2011-02-08T15:56:19.902919+01:00 phy005 kernel: [<ffffffff8113e9b3>]
>> aio_run_iocb+0x77/0x10f
>> 2011-02-08T15:56:19.902923+01:00 phy005 kernel: [<ffffffff8113f508>]
>> do_io_submit+0x558/0x7ce
>> 2011-02-08T15:56:19.902927+01:00 phy005 kernel: [<ffffffff8113f78e>]
>> sys_io_submit+0x10/0x12
>> 2011-02-08T15:56:19.902932+01:00 phy005 kernel: [<ffffffff81009c72>]
>> system_call_fastpath+0x16/0x1b
>> 2011-02-08T15:56:19.902938+01:00 phy005 kernel: Code: 21 d8 49 01 c2
>> 49 8b 3a 49 89 fe 4d 21 ee 4d 21 e6 49 39 ce 75 49 48 89 f8 0f 1f 40
>> 00 48 21 d8 48 c1 e8 0c 48 6b c0 38 4c 01 d8<66>  83 38 00 48 89 c7 79
>> 04 48 8b 78 10 f0 ff 47 08 49 63 39 48
>> 2011-02-08T15:56:19.903077+01:00 phy005 kernel: RIP
>> [<ffffffff81034880>] gup_pte_range+0x94/0xd3
>> 2011-02-08T15:56:19.903081+01:00 phy005 kernel: RSP<ffff880c21bc1a78>
>> 2011-02-08T15:56:19.903084+01:00 phy005 kernel: CR2: ffffea71929181c0
>> 2011-02-08T15:56:19.903088+01:00 phy005 kernel: ---[ end trace
>> 174c28940e9fd0a7 ]---
>>
>
> Again outside kvm.
>
>> and yesterday this one:
>>
>> 2011-02-09T07:40:15.636528+01:00 phy005 kernel: BUG: unable to handle
>> kernel NULL pointer dereference at (null)
>> 2011-02-09T07:40:15.636635+01:00 phy005 kernel: IP:
>> [<ffffffffa0082db8>] gfn_to_rmap+0x20/0x6e [kvm]
>> 2011-02-09T07:40:15.636639+01:00 phy005 kernel: PGD 0
>> 2011-02-09T07:40:15.636643+01:00 phy005 kernel: Oops: 0000 [#3] SMP
>> 2011-02-09T07:40:15.636647+01:00 phy005 kernel: last sysfs file:
>> /sys/devices/system/cpu/cpu15/topology/thread_siblings
>> 2011-02-09T07:40:15.636650+01:00 phy005 kernel: CPU 2
>> 2011-02-09T07:40:15.636656+01:00 phy005 kernel: Modules linked in: tun
>> ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding
>> xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter
>> ip6_tables ipv6 kvm_intel kvm igb i2c_i801 iTCO_wdt ioatdma i2c_core
>> iTCO_vendor_support dca serio_raw joydev 3w_9xxx [last unloaded:
>> scsi_wait_scan]
>> 2011-02-09T07:40:15.636663+01:00 phy005 kernel:
>> 2011-02-09T07:40:15.636666+01:00 phy005 kernel: Pid: 2572, comm:
>> qemu-kvm Tainted: G      D    2.6.34.7-66.tilaa.fc13.x86_64 #1
>> X8DTU/X8DTU
>> 2011-02-09T07:40:15.636670+01:00 phy005 kernel: RIP:
>> 0010:[<ffffffffa0082db8>]  [<ffffffffa0082db8>] gfn_to_rmap+0x20/0x6e
>> [kvm]
>> 2011-02-09T07:40:15.636673+01:00 phy005 kernel: RSP:
>> 0018:ffff88061cbcbcd8  EFLAGS: 00010246
>> 2011-02-09T07:40:15.636677+01:00 phy005 kernel: RAX: 0000000000000000
>> RBX: 1603a07305004fff RCX: ffff88061cbcbd08
>> 2011-02-09T07:40:15.636680+01:00 phy005 kernel: RDX: 0000000000000023
>> RSI: 1603a07305004fff RDI: 0000000000000000
>> 2011-02-09T07:40:15.636683+01:00 phy005 kernel: RBP: ffff88061cbcbce8
>> R08: 0000000000000023 R09: 0000000000000000
>> 2011-02-09T07:40:15.636686+01:00 phy005 kernel: R10: 0000000000000000
>> R11: ffffffffa0082c7f R12: 0000000000000001
>> 2011-02-09T07:40:15.636689+01:00 phy005 kernel: R13: 0000000000311763
>> R14: ffff8809b8b01ce0 R15: 0000000000000000
>> 2011-02-09T07:40:15.636692+01:00 phy005 kernel: FS:
>> 0000000000000000(0000) GS:ffff880002040000(0000)
>> knlGS:0000000000000000
>> 2011-02-09T07:40:15.636695+01:00 phy005 kernel: CS:  0010 DS: 0000 ES:
>> 0000 CR0: 000000008005003b
>> 2011-02-09T07:40:15.636699+01:00 phy005 kernel: CR2: 0000000000000000
>> CR3: 0000000001a42000 CR4: 00000000000026e0
>> 2011-02-09T07:40:15.636702+01:00 phy005 kernel: DR0: 0000000000000000
>> DR1: 0000000000000000 DR2: 0000000000000000
>> 2011-02-09T07:40:15.636705+01:00 phy005 kernel: DR3: 0000000000000000
>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> 2011-02-09T07:40:15.636709+01:00 phy005 kernel: Process qemu-kvm (pid:
>> 2572, threadinfo ffff88061cbca000, task ffff88061cf04650)
>> 2011-02-09T07:40:15.636711+01:00 phy005 kernel: Stack:
>> 2011-02-09T07:40:15.636715+01:00 phy005 kernel: ffff88036c471ff8
>> ffff880c23984000 ffff88061cbcbd18 ffffffffa0082ea9
>> 2011-02-09T07:40:15.636718+01:00 phy005 kernel:<0>  ffff8809b8b01ce0
>> ffff880c23984000 ffff88036c471ff8 00000000000001ff
>> 2011-02-09T07:40:15.636721+01:00 phy005 kernel:<0>  ffff88061cbcbd58
>> ffffffffa008363b 0000000000000200 ffff880c23984000
>> 2011-02-09T07:40:15.636724+01:00 phy005 kernel: Call Trace:
>> 2011-02-09T07:40:15.636728+01:00 phy005 kernel: [<ffffffffa0082ea9>]
>> rmap_remove+0xa3/0x1a0 [kvm]
>> 2011-02-09T07:40:15.636731+01:00 phy005 kernel: [<ffffffffa008363b>]
>> kvm_mmu_zap_page+0x9f/0x299 [kvm]
>> 2011-02-09T07:40:15.636734+01:00 phy005 kernel: [<ffffffffa0083a42>]
>> kvm_mmu_zap_all+0x35/0x60 [kvm]
>> 2011-02-09T07:40:15.636738+01:00 phy005 kernel: [<ffffffffa0078cde>]
>> kvm_arch_flush_shadow+0x16/0x22 [kvm]
>> 2011-02-09T07:40:15.636741+01:00 phy005 kernel: [<ffffffffa006eb0a>]
>> kvm_mmu_notifier_release+0x31/0x44 [kvm]
>> 2011-02-09T07:40:15.636744+01:00 phy005 kernel: [<ffffffff810fac37>]
>> __mmu_notifier_release+0x4f/0x7b
>> 2011-02-09T07:40:15.636748+01:00 phy005 kernel: [<ffffffff810e735d>]
>> exit_mmap+0x2c/0x132
>> 2011-02-09T07:40:15.636751+01:00 phy005 kernel: [<ffffffff8104ad7a>]
>> mmput+0x5e/0xca
>> 2011-02-09T07:40:15.636754+01:00 phy005 kernel: [<ffffffff8104f0d5>]
>> exit_mm+0x114/0x121
>> 2011-02-09T07:40:15.636757+01:00 phy005 kernel: [<ffffffff81050bf5>]
>> do_exit+0x254/0x752
>> 2011-02-09T07:40:15.636760+01:00 phy005 kernel: [<ffffffff8100a60e>] ?
>> apic_timer_interrupt+0xe/0x20
>> 2011-02-09T07:40:15.636764+01:00 phy005 kernel: [<ffffffff81051174>]
>> do_group_exit+0x81/0xab
>> 2011-02-09T07:40:15.636767+01:00 phy005 kernel: [<ffffffff810511b5>]
>> sys_exit_group+0x17/0x1b
>> 2011-02-09T07:40:15.636771+01:00 phy005 kernel: [<ffffffff81009c72>]
>> system_call_fastpath+0x16/0x1b
>> 2011-02-09T07:40:15.636777+01:00 phy005 kernel: Code: 88 ff ff ff b8
>> 01 00 00 00 c9 c3 55 48 89 e5 41 54 53 0f 1f 44 00 00 41 89 d4 48 89
>> f3 e8 7b c7 fe ff 41 83 fc 01 48 89 c7 75 0d<48>  2b 18 48 c1 e3 03 48
>> 03 58 18 eb 39 41 8d 4c 24 ff be 01 00
>> 2011-02-09T07:40:15.636785+01:00 phy005 kernel: RIP
>> [<ffffffffa0082db8>] gfn_to_rmap+0x20/0x6e [kvm]
>> 2011-02-09T07:40:15.636788+01:00 phy005 kernel: RSP<ffff88061cbcbcd8>
>> 2011-02-09T07:40:15.636791+01:00 phy005 kernel: CR2: 0000000000000000
>> 2011-02-09T07:40:15.637743+01:00 phy005 kernel: ---[ end trace
>> 174c28940e9fd0a9 ]---
>> 2011-02-09T07:40:15.637751+01:00 phy005 kernel: Fixing recursive fault
>> but reboot is needed!
>>
>
> In kvm.  Was there a reboot between the two?

No, there wasn't. I've just looked back at the logs and there was
another oops in between:

2011-02-09T04:28:01.890999+01:00 phy005 kernel: general protection
fault: 0000 [#2] SMP
2011-02-09T04:28:01.891122+01:00 phy005 kernel: last sysfs file:
/sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_map
2011-02-09T04:28:01.891127+01:00 phy005 kernel: CPU 12
2011-02-09T04:28:01.891137+01:00 phy005 kernel: Modules linked in: tun
ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q
garp stp llc bonding xt_comment xt_recent ip6t_REJECT
nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel kvm
igb i2c_i801 iTCO_wdt ioatdma i2c_core iTCO_vendor_support dca
serio_raw joydev 3w_9xxx [last unloaded: scsi_wait_scan]
2011-02-09T04:28:01.891144+01:00 phy005 kernel:
2011-02-09T04:28:01.891148+01:00 phy005 kernel: Pid: 19782, comm: find
Tainted: G      D    2.6.34.7-66.tilaa.fc13.x86_64 #1 X8DTU/X8DTU
2011-02-09T04:28:01.891154+01:00 phy005 kernel: RIP:
0010:[<ffffffff81158aa4>]  [<ffffffff81158aa4>] proc_fd_instantiate+0x88/0x127
2011-02-09T04:28:01.891157+01:00 phy005 kernel: RSP:
0018:ffff880245677da8  EFLAGS: 00010206
2011-02-09T04:28:01.891161+01:00 phy005 kernel: RAX: 1603a07305000000
RBX: ffff8808076ada40 RCX: ffff88058bbbddc0
2011-02-09T04:28:01.891164+01:00 phy005 kernel: RDX: 000000000000022a
RSI: ffff8808076ada40 RDI: ffff88062293ee80
2011-02-09T04:28:01.891168+01:00 phy005 kernel: RBP: ffff880245677dc8
R08: ffff8808076a91d0 R09: ffffffff81158a1c
2011-02-09T04:28:01.891172+01:00 phy005 kernel: R10: 0000000000000002
R11: ffff880245677d08 R12: ffff88062293ee00
2011-02-09T04:28:01.891176+01:00 phy005 kernel: R13: ffff8805b3897bf8
R14: ffff8808076a9430 R15: ffff8807ddd76c00
2011-02-09T04:28:01.891180+01:00 phy005 kernel: FS:
00007f09aa8e07a0(0000) GS:ffff880655480000(0000) knlGS:0000000000000000
2011-02-09T04:28:01.891184+01:00 phy005 kernel: CS:  0010 DS: 0000 ES:
0000 CR0: 0000000080050033
2011-02-09T04:28:01.891188+01:00 phy005 kernel: CR2: 0000000000e43080
CR3: 00000007d6d6c000 CR4: 00000000000026e0
2011-02-09T04:28:01.891192+01:00 phy005 kernel: DR0: 0000000000000000
DR1: 0000000000000000 DR2: 0000000000000000
2011-02-09T04:28:01.891196+01:00 phy005 kernel: DR3: 0000000000000000
DR6: 00000000ffff0ff0 DR7: 0000000000000400
2011-02-09T04:28:01.891199+01:00 phy005 kernel: Process find (pid:
19782, threadinfo ffff880245676000, task ffff88058bbb8000)
2011-02-09T04:28:01.891202+01:00 phy005 kernel: Stack:
2011-02-09T04:28:01.891206+01:00 phy005 kernel: ffff880245677e78
0000000000000003 ffff8802bfe0af00 ffff8808076ada40
2011-02-09T04:28:01.891209+01:00 phy005 kernel: <0> ffff880245677e38
ffffffff811564b8 ffff880245677e38 ffffffff81158a1c
2011-02-09T04:28:01.891213+01:00 phy005 kernel: <0> ffffffff8111b530
ffff880245677f38 0000000300119d45 ffff880245677e78
2011-02-09T04:28:01.891216+01:00 phy005 kernel: Call Trace:
2011-02-09T04:28:01.891220+01:00 phy005 kernel: [<ffffffff811564b8>]
proc_fill_cache+0xa7/0x13f
2011-02-09T04:28:01.891224+01:00 phy005 kernel: [<ffffffff81158a1c>] ?
proc_fd_instantiate+0x0/0x127
2011-02-09T04:28:01.891227+01:00 phy005 kernel: [<ffffffff8111b530>] ?
filldir+0x0/0xd0
2011-02-09T04:28:01.891231+01:00 phy005 kernel: [<ffffffff8111b530>] ?
filldir+0x0/0xd0
2011-02-09T04:28:01.891235+01:00 phy005 kernel: [<ffffffff811586c8>]
proc_readfd_common+0x159/0x1a3
2011-02-09T04:28:01.891239+01:00 phy005 kernel: [<ffffffff81158a1c>] ?
proc_fd_instantiate+0x0/0x127
2011-02-09T04:28:01.891242+01:00 phy005 kernel: [<ffffffff8111b530>] ?
filldir+0x0/0xd0
2011-02-09T04:28:01.891246+01:00 phy005 kernel: [<ffffffff8115873e>]
proc_readfd+0x15/0x17
2011-02-09T04:28:01.891250+01:00 phy005 kernel: [<ffffffff8111b731>]
vfs_readdir+0x77/0xb4
2011-02-09T04:28:01.891254+01:00 phy005 kernel: [<ffffffff8111b8b7>]
sys_getdents+0x81/0xd1
2011-02-09T04:28:01.891258+01:00 phy005 kernel: [<ffffffff81009c72>]
system_call_fastpath+0x16/0x1b
2011-02-09T04:28:01.891263+01:00 phy005 kernel: Code: e8 08 3e 2f 00
49 8b 44 24 08 44 3b 28 0f 83 9c 00 00 00 45 89 ed 49 c1 e5 03 4c 03
68 08 49 8b 45 00 48 85 c0 0f 84 84 00 00 00 <f6> 40 3c 01 74 0a 66 41
81 8e aa 00 00 00 40 01 f6 40 3c 02 74
2011-02-09T04:28:01.891275+01:00 phy005 kernel: RIP
[<ffffffff81158aa4>] proc_fd_instantiate+0x88/0x127
2011-02-09T04:28:01.891279+01:00 phy005 kernel: RSP <ffff880245677da8>
2011-02-09T04:28:01.891283+01:00 phy005 kernel: ---[ end trace
174c28940e9fd0a8 ]---

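Note that RAX in this oops (1603a07305000000) carries the same 1603a07305
pattern as the bad ptes in the gup_pte_range oopses. As a cross-check, the
faulting addresses in those two oopses can be reproduced from the pte value
in RDI. The sketch below is only my reading of the register dump and the
"Code:" bytes (and rax,rbx; shr 12; imul 0x38; add r11), so the constants are
assumptions taken from this particular 2.6.34 build, not from kernel headers:
R11 as the vmemmap base, RBX as the pte physical-address mask, and 0x38 as
sizeof(struct page). The function names are made up for illustration.

```python
# Assumed constants, read off the oops output quoted in this mail.
VMEMMAP_BASE = 0xffffea0000000000      # R11 in both gup_pte_range oopses
PTE_PFN_MASK = 0x00003ffffffff000      # RBX in both gup_pte_range oopses
STRUCT_PAGE_SIZE = 0x38                # immediate of the imul in "Code:"
PAGE_SHIFT = 12                        # shift count of the shr in "Code:"
U64 = 0xffffffffffffffff

def page_struct_addr(pte):
    """Address of the struct page that gup_pte_range would touch for pte."""
    pfn = (pte & PTE_PFN_MASK) >> PAGE_SHIFT
    return (VMEMMAP_BASE + pfn * STRUCT_PAGE_SIZE) & U64

def split_pte(pte):
    """Split a pte into low flag bits, masked physical address, and the rest."""
    flags = pte & 0xfff
    phys = pte & PTE_PFN_MASK
    high = pte & ~(PTE_PFN_MASK | 0xfff) & U64
    return flags, phys, high

if __name__ == "__main__":
    # (pte from RDI, faulting address from CR2) for the two gup oopses
    samples = [
        (0x1603a07305004067, 0xffffea71929180e0),
        (0x1603a07305008067, 0xffffea71929181c0),
    ]
    for pte, cr2 in samples:
        flags, phys, high = split_pte(pte)
        addr = page_struct_addr(pte)
        print("pte %016x: flags=%03x phys=%x high=%016x" % (pte, flags, phys, high))
        print("  computed struct page %016x, CR2 %016x, match=%s"
              % (addr, cr2, addr == cr2))
```

Both computed addresses match CR2, so the oops in gup_pte_range looks like a
direct consequence of following the bogus pte: the low bits (0x067) look like
a normal present/writable/user pte, but the address part points far outside
the physical addresses seen elsewhere in the dump (compare CR3).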
>
>> So it doesn't seem to be a hardware problem since we replaced all that.
>
> I agree.  And your other machines are stable?

Yes, the other ones have been running for ages without problems.
We've been using 2.6.34.7 for about three months now.

> When you say "identical software", are those exactly the same binaries?

Yes, the same (kickstarted) install, the same rpms.

> copying Andrea for possible insight into the non-kvm oopses.
>
> --
> error compiling committee.c: too many arguments to function

Kind regards,

Ruben