On Thursday, June 09, 2011 15:11:23 Patrick McHardy wrote:
> On 09.06.2011 14:57, Hans Schillstrom wrote:
> > Hello
> > I have a problem with ip_vs_conn_flush() and expiring timers ...
> > After a couple of hours checking locks, I'm still no closer to a solution.
> > Conntrack differs a bit between 2.6.32 and 2.6.39, but I don't think that's the reason in this case.
> >
> > I think the netns cleanup caused this, but I'm not a conntrack expert :)
> >
> > The dump below is from ipvs back-ported to 2.6.32.27.
> > The extra patches that I sent to Simon, which renamed the cleanup functions, are applied, i.e.
> > __ip_vs_conn_cleanup renamed to ip_vs_conn_net_cleanup etc.
> >
> > [ 532.287410] CPU 3
> > [ 532.287410] Modules linked in: xt_mark xt_conntrack ip_vs_wrr(N) ip_vs_lc(N) xt_tcpudp ip_vs_rr(N) nf_conntrack_ipv6 xt_MARK xt_state xt_CONNMARK xt_connmark xt_multiport nf_conntrack_netlink nfnetlink xt_hmark(N) ip6table_mangle iptable_mangle ip6table_filter iptable_filter ip6_tables ip_tables x_tables nf_conntrack_ipv4 nf_defrag_ipv4 ip_vs(N) nf_conntrack ip6_tunnel tunnel6 tipc(N) nfs fscache af_packet nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs drbd softdog bonding macvlan ipv6 ext3 jbd mbcache loop dm_mod usbhid hid ide_pci_generic piix ide_core ata_generic ata_piix ahci libata hpsa uhci_hcd hpilo xen_platform_pci cdrom pcspkr ehci_hcd cciss tpm_tis tpm tpm_bios bnx2 serio_raw ipmi_si ipmi_msghandler i5k_amb i5000_edac rtc_cmos rtc_core rtc_lib usbcore container e1000e edac_core shpchp pci_hotplug scsi_mod button thermal processor thermal_sys hwmon
> > [ 532.350212] Supported: Yes
> > [ 532.350212] Pid: 17, comm: netns Tainted: G N 2.6.32.27-0.2.2.2501.1.PTF-evip #1 ProLiant DL380 G5
> > [ 532.386359] RIP: 0010:[<ffffffff8131aaf6>] [<ffffffff8131aaf6>] netlink_has_listeners+0x36/0x40
> > [ 532.386359] RSP: 0018:ffff880005ac3bf0 EFLAGS: 00010246
> > [ 532.386359] RAX: 0000000000000002 RBX: ffff8800345e2ea8 RCX: ffff880005ac3c98
> > [ 532.386359] RDX: ffff88012380d740 RSI: 0000000000000003 RDI: ffff88012819a400
> > [ 532.386359] RBP: ffff880005ac3d28 R08: ffffffffa0641ca8 R09: 0000000000000024
> > [ 532.386359] R10: 0000000000004000 R11: 0000000000000000 R12: ffff8800345e2ea8
> > [ 532.386359] R13: 0000000000000002 R14: 0000000000000004 R15: ffff88012a875fd8
> > [ 532.386359] FS: 0000000000000000(0000) GS:ffff880005ac0000(0000) knlGS:0000000000000000
> > [ 532.386359] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > [ 532.386359] CR2: 00007f6a0ecba000 CR3: 0000000001804000 CR4: 00000000000406e0
> > [ 532.386359] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [ 532.386359] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > [ 532.485087] Process netns (pid: 17, threadinfo ffff88012a874000, task ffff88012a872500)
> > [ 532.485087] Stack:
> > [ 532.485087] ffffffffa063ff92 ffffffff811e2805 ffff880005ac3c98 00000004811e2805
> > [ 532.485087] <0> ffff88011d9bd500 0000000300000000 ffff880005ac3da0 ffffffff8103e402
> > [ 532.485087] <0> ffff880005ac3d40 ffff880005ac3cb0 0000000000013700 0000000000000010
> > [ 532.485087] Call Trace:
> > [ 532.485087] [<ffffffffa063ff92>] ctnetlink_conntrack_event+0x92/0x730 [nf_conntrack_netlink]
> > [ 532.485087] [<ffffffffa058b274>] death_by_timeout+0xc4/0x190 [nf_conntrack]   ## ct->timeout.function(ct->timeout.data); ##
> > [ 532.485087] [<ffffffffa05c544d>] ip_vs_conn_drop_conntrack+0x13d/0x360 [ip_vs]
> > [ 532.485087] [<ffffffffa05ae30d>] ip_vs_conn_expire+0x12d/0x7d0 [ip_vs]   ## expired timer ##
> > [ 532.485087] [<ffffffff81059424>] run_timer_softirq+0x174/0x240
> > [ 532.549596] [<ffffffff810544ef>] __do_softirq+0xbf/0x170
> > [ 532.549596] [<ffffffff810040bc>] call_softirq+0x1c/0x30
> > [ 532.549596] [<ffffffff81005d1d>] do_softirq+0x4d/0x80
> > [ 532.549596] [<ffffffff81054791>] local_bh_enable_ip+0xa1/0xb0   ## ct_write_unlock_bh(idx); ##
> > [ 532.549596] [<ffffffffa05ac25b>] ip_vs_conn_net_cleanup+0xdb/0x160 [ip_vs]   ## ip_vs_flush in-lined ##
> > [ 532.576259] [<ffffffffa05afae1>] __ip_vs_cleanup+0x11/0x90 [ip_vs]
> > [ 532.576259] [<ffffffff812f840e>] cleanup_net+0x5e/0xb0
> > [ 532.576259] [<ffffffff81061468>] run_workqueue+0xb8/0x140
> > [ 532.594226] [<ffffffff8106158a>] worker_thread+0x9a/0x110
> > [ 532.594226] [<ffffffff81065696>] kthread+0x96/0xb0
> > [ 532.603788] [<ffffffff81003fba>] child_rip+0xa/0x20
> > [ 532.603788] Code: 47 41 8d 4e ff 48 8d 14 80 48 8d 14 50 31 c0 48 c1 e2 03 48 03 15 7b b2 9b 00 3b 4a 3c 48 8b 7a 30 72 02 f3 c3 0f a3 0f 19 c0 c3 <0f> 0b eb fe 66 0f 1f 44 00 00 48 81 ec 98 00 00 00 41 f6 c0 01
> > [ 532.623046] RIP [<ffffffff8131aaf6>] netlink_has_listeners+0x36/0x40
>
> This looks like nfnetlink.c exited and destroyed the nfnl socket, but
> ip_vs was still holding a reference to a conntrack. When the conntrack
> got destroyed it created a ctnetlink event, causing an oops in
> netlink_has_listeners when trying to use the destroyed nfnetlink
> socket.
>
> Usually this shouldn't happen since network namespace cleanup
> happens in reverse order from registration. In this case the
> reason might be that IPVS has no dependencies on conntrack
> or ctnetlink and therefore can get loaded first, meaning it
> will get cleaned up afterwards.
>
> Does that make any sense?

Yes.
From what I can see, ip_vs has a dependency on nf_conntrack but not on
nf_conntrack_netlink, i.e. nf_conntrack is loaded first, then ip_vs,
and nf_conntrack_netlink last.

It's hard to tell exactly what was going on in user-space when the lxc
container got killed...
Basically there is a lot of traffic (and connections) through the
container, with ipvs inside:
 - ipvs conntrack support is turned on
 - iptables with conntrack
 - conntrackd is running
 - ~50 iptables rules
I'm not sure if it's only IPv4 traffic ...

Hmmm... I think I know: the culprit is conntrackd!!
(i.e. it causes loading of ct_netlink)
conntrackd will definitely get killed before the namespace exit starts.
I think it is as you describe; I will run some tests tomorrow.
How to solve this is another question...

Thanks a lot Patrick.

Regards
Hans Schillstrom
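
P.S.
To make the ordering problem concrete, here is a much simplified sketch
of the mechanism (pieced together from the trace above and my memory of
the 2.6.32-era net/core/net_namespace.c, so treat it as an illustration,
not the exact code):

/* Each module registers per-netns init/exit hooks when it loads,
 * e.g. the ipvs core in the netns patches:
 */
static struct pernet_operations ipvs_core_ops = {
	.init = __ip_vs_init,
	.exit = __ip_vs_cleanup,	/* the hook seen in the trace */
};
/* ... register_pernet_subsys(&ipvs_core_ops); at module init ... */

/* net/core/net_namespace.c (simplified): when a namespace dies,
 * cleanup_net() runs the exit hooks in *reverse* registration order,
 * so a module loaded later (nf_conntrack_netlink, pulled in here by
 * conntrackd) is torn down *before* a module loaded earlier (ip_vs):
 */
static void cleanup_net(struct work_struct *work)
{
	const struct pernet_operations *ops;
	struct net *net = container_of(work, struct net, work);

	/* ... unhash the namespace so nobody new can find it ... */

	list_for_each_entry_reverse(ops, &pernet_list, list)
		if (ops->exit)
			ops->exit(net);	/* ctnetlink's exit runs before
					 * __ip_vs_cleanup() */
	/* ... */
}

/* So by the time __ip_vs_cleanup() flushes the connection table and
 * ip_vs_conn_drop_conntrack() lets death_by_timeout() destroy the
 * conntrack, ctnetlink tries to send the destroy event through an
 * nfnl socket that is already gone, and netlink_has_listeners()
 * oopses exactly as in the dump above.
 */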