https://bugzilla.redhat.com/show_bug.cgi?id=975065 ?

On Mon, Jul 29, 2013 at 11:02:01AM +0200, Artur Samborski wrote:
> Hello,
>
> we have another problem with KVM on our production machines.
>
> After updating the OS (Fedora 18) our KVM virtual machines
> started to crash. Tests have shown that these crashes are associated
> with heavy network traffic.
>
> When the virtual machine hangs, this message appears in the KVM-host
> kernel (3.9.9-201.fc18.x86_64) log:
>
>
> BUG: unable to handle kernel NULL pointer dereference at (null)
> IP: [<ffffffff81141af1>] put_page+0x11/0x60
> PGD 0
> Oops: 0000 [#1] SMP
> Modules linked in: binfmt_misc ip6table_filter ip6_tables
> ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat
> nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack
> xt_CHECKSUM iptable_mangle bridge stp llc be2iscsi iscsi_boot_sysfs
> bnx2i cnic uio cxgb4i cxgb4 cxgb3i cxgb3 mdio libcxgbi ib_iser
> rdma_cm ib_addr iw_cm ib_cm ib_sa ib_mad ib_core iscsi_tcp
> libiscsi_tcp libiscsi scsi_transport_iscsi e1000e iTCO_wdt
> iTCO_vendor_support ptp pps_core vhost_net ses ioatdma dcdbas mperf
> shpchp i7core_edac lpc_ich edac_core dca mfd_core tun macvtap
> macvlan enclosure bnx2 coretemp crc32c_intel serio_raw microcode
> kvm_intel acpi_power_meter kvm ipmi_devintf ipmi_si ipmi_msghandler
> mgag200 i2c_algo_bit drm_kms_helper ttm drm i2c_core megaraid_sas
> wmi
> CPU 2
> Pid: 7524, comm: vhost-7521 Tainted: G W I
> 3.9.9-201.fc18.x86_64 #1 Dell Inc. PowerEdge R610/0K399H
> RIP: 0010:[<ffffffff81141af1>] [<ffffffff81141af1>] put_page+0x11/0x60
> RSP: 0018:ffff880427a31c28 EFLAGS: 00010296
> RAX: ffff88065d8e16c0 RBX: 0000000000000000 RCX: 0000000000000006
> RDX: 0000000000000150 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: ffff880427a31c38 R08: 000000000000000a R09: 00000000000006f7
> R10: 0000000000000000 R11: 00000000000006f6 R12: ffff8808273c9d00
> R13: ffffffffa0180237 R14: ffff88067c8d43d8 R15: ffff8808273c9d00
> FS: 0000000000000000(0000) GS:ffff88083fc20000(0000)
> knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000000000000000 CR3: 000000082864a000 CR4: 00000000000027e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process vhost-7521 (pid: 7524, threadinfo ffff880427a30000, task
> ffff880427f94650)
> Stack:
> ffffea001c159340 0000000000000013 ffff880427a31c58 ffffffff8154676f
> ffff8808273c9d00 ffff8808273c9d00 ffff880427a31c78 ffffffff8154680e
> ffffea001c15b080 ffff880828f45800 ffff880427a31ca8 ffffffff815468c6
> Call Trace:
> [<ffffffff8154676f>] skb_release_data+0x8f/0x110
> [<ffffffff8154680e>] __kfree_skb+0x1e/0xa0
> [<ffffffff815468c6>] kfree_skb+0x36/0xa0
> [<ffffffffa0180237>] macvtap_get_user+0x317/0x510 [macvtap]
> [<ffffffffa018045b>] macvtap_sendmsg+0x2b/0x30 [macvtap]
> [<ffffffffa0258db7>] handle_tx+0x287/0x680 [vhost_net]
> [<ffffffffa02591e5>] handle_tx_kick+0x15/0x20 [vhost_net]
> [<ffffffffa025595d>] vhost_worker+0xed/0x190 [vhost_net]
> [<ffffffffa0255870>] ? vhost_work_flush+0x110/0x110 [vhost_net]
> [<ffffffff81082ba0>] kthread+0xc0/0xd0
> [<ffffffff81010000>] ? ftrace_define_fields_xen_mc_flush+0x20/0xb0
> [<ffffffff81082ae0>] ? kthread_create_on_node+0x120/0x120
> [<ffffffff8166af2c>] ret_from_fork+0x7c/0xb0
> [<ffffffff81082ae0>] ? kthread_create_on_node+0x120/0x120
> Code: 45 fc 65 48 01 04 25 70 02 01 00 c9 c3 66 66 66 66 2e 0f 1f
> 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 53 48 89 fb 48 83 ec 08
> <48> f7 07 00 c0 00 00 75 34 8b 47 1c 85 c0 74 1a f0 ff 4b 1c 0f
> RIP [<ffffffff81141af1>] put_page+0x11/0x60
> RSP <ffff880427a31c28>
> CR2: 0000000000000000
> ---[ end trace cb305c3097c1de97 ]---
>
>
> After returning to the previously working kernel (3.7.0, manually
> compiled from kvm git sources), the problem still persists:
>
>
> BUG: unable to handle kernel paging request at 0000040200000401
> IP: [<ffffffff8113e445>] put_page+0x5/0x50
> PGD 0
> Oops: 0000 [#1] SMP
> Modules linked in: binfmt_misc ip6table_filter ip6_tables
> ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack
> nf_conntrack xt_CHECKSUM iptable_mangle be2iscsi iscsi_boot_sysfs
> bnx2i cnic uio cxgb4i cxgb4 cxgb3i cxgb3 mdio libcxgbi ib_iser
> rdma_cm ib_addr iw_cm ib_cm ib_sa bridge stp llc ib_mad ib_core
> iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi vhost_net
> coretemp e1000e ioatdma tun macvtap macvlan bnx2 iTCO_wdt shpchp
> crc32c_intel microcode dca ses iTCO_vendor_support lpc_ich wmi
> dcdbas kvm_intel i7core_edac edac_core enclosure joydev
> acpi_power_meter serio_raw pcspkr mfd_core kvm ipmi_devintf ipmi_si
> ipmi_msghandler megaraid_sas
> CPU 0
> Pid: 1505, comm: vhost-1502 Tainted: G W 3.7.0HYDRA_02+
> #1 Dell Inc. PowerEdge R610/0K399H
> RIP: 0010:[<ffffffff8113e445>] [<ffffffff8113e445>] put_page+0x5/0x50
> RSP: 0018:ffff880823e6bc50 EFLAGS: 00010202
> RAX: ffff88066d34bec0 RBX: 0000000000000012 RCX: ffffea0019cf001c
> RDX: 0000000000000140 RSI: 0000000000000246 RDI: 0000040200000401
> RBP: ffff880823e6bc68 R08: ffff880823e444f8 R09: 0000000000000010
> R10: 0000000000000000 R11: 00003ffffffff000 R12: ffff880827d34700
> R13: ffffffffa01371a8 R14: ffff880823e443d8 R15: ffff880827d34700
> FS: 0000000000000000(0000) GS:ffff88083fc00000(0000)
> knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000040200000401 CR3: 0000000825272000 CR4: 00000000000027e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process vhost-1502 (pid: 1505, threadinfo ffff880823e6a000, task
> ffff880825a1c5c0)
> Stack:
> ffffffff81520c1f ffff880827d34700 ffff880827d34700 ffff880823e6bc88
> ffffffff81520cbe ffffea0019cf69c0 ffff88042a1df400 ffff880823e6bcb8
> ffffffff81520d76 000000000000000c ffff88042a1df400 000000000000000c
> Call Trace:
> [<ffffffff81520c1f>] ? skb_release_data+0x8f/0x110
> [<ffffffff81520cbe>] __kfree_skb+0x1e/0xa0
> [<ffffffff81520d76>] kfree_skb+0x36/0xa0
> [<ffffffffa01371a8>] macvtap_get_user+0x248/0x490 [macvtap]
> [<ffffffffa013741b>] macvtap_sendmsg+0x2b/0x30 [macvtap]
> [<ffffffffa0165d2a>] handle_tx+0x28a/0x680 [vhost_net]
> [<ffffffffa0166155>] handle_tx_kick+0x15/0x20 [vhost_net]
> [<ffffffffa016295d>] vhost_worker+0xed/0x190 [vhost_net]
> [<ffffffffa0162870>] ? vhost_work_flush+0x110/0x110 [vhost_net]
> [<ffffffff81081750>] kthread+0xc0/0xd0
> [<ffffffff81010000>] ? ftrace_define_fields_xen_mc_entry+0x50/0xf0
> [<ffffffff81081690>] ? kthread_create_on_node+0x120/0x120
> [<ffffffff8163fdac>] ret_from_fork+0x7c/0xb0
> [<ffffffff81081690>] ? kthread_create_on_node+0x120/0x120
> Code: fc 00 00 00 00 e8 ac fe ff ff 48 63 45 fc 65 48 01 04 25 b8
> 06 01 00 c9 c3 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90
> <48> f7 07 00 c0 00 00 55 48 89 e5 75 2a 8b 47 1c 85 c0 74 1e f0
> RIP [<ffffffff8113e445>] put_page+0x5/0x50
> RSP <ffff880823e6bc50>
> CR2: 0000040200000401
>
>
> Only after a complete rollback to the previous state of the system
> does everything work properly again (the problem disappears).
> Hence my suspicion: could it be associated with some userspace
> tools?
>
> I will be grateful for any hints.
>
> Regards,
> Artur Samborski
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
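
Both reports fault in put_page while macvtap_get_user is freeing an skb through kfree_skb, so it may help to compare the call chains of such oopses mechanically. Below is a minimal sketch, assuming the oops text has been pasted into a string; the helper name trace_symbols and the sample excerpt are hypothetical and not part of any kernel tooling. Frames prefixed with "?" are the ones the kernel marks as unreliable and are skipped by default.

```python
import re

# Matches trace entries such as
#   [<ffffffff8154676f>] skb_release_data+0x8f/0x110
#   [<ffffffffa0255870>] ? vhost_work_flush+0x110/0x110 [vhost_net]
# The optional "?" marks a frame the kernel considers unreliable.
TRACE_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def trace_symbols(oops_text, include_unreliable=False):
    """Return function names from the 'Call Trace:' section, innermost first.

    Hypothetical helper for illustration only. Quote markers ("> ") in a
    pasted email do not matter because the regex searches inside each line.
    """
    _, found, tail = oops_text.partition("Call Trace:")
    if not found:
        return []
    # The trace ends where the "Code:" byte dump begins.
    tail = tail.split("Code:", 1)[0]
    symbols = []
    for match in TRACE_RE.finditer(tail):
        if match.group("unreliable") and not include_unreliable:
            continue
        symbols.append(match.group("symbol"))
    return symbols


# Short excerpt from the first oops in this thread.
SAMPLE = """\
> Call Trace:
> [<ffffffff8154676f>] skb_release_data+0x8f/0x110
> [<ffffffff8154680e>] __kfree_skb+0x1e/0xa0
> [<ffffffff815468c6>] kfree_skb+0x36/0xa0
> [<ffffffffa0180237>] macvtap_get_user+0x317/0x510 [macvtap]
> [<ffffffffa0255870>] ? vhost_work_flush+0x110/0x110 [vhost_net]
> Code: 45 fc 65 48 01 04 25 70 02 01 00 c9 c3
"""

if __name__ == "__main__":
    print(trace_symbols(SAMPLE))
```

Running the same function over the 3.9.9 and 3.7.0 traces and comparing the resulting lists is one way to confirm that the two crashes share the macvtap transmit path.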