Re: ipvs netns exit causes crash in conntrack.

On Thursday, June 09, 2011 15:11:23 Patrick McHardy wrote:
> On 09.06.2011 14:57, Hans Schillstrom wrote:
> > Hello 
> > I have a problem with ip_vs_conn_flush() and expiring timers ...
> > After a couple of hours checking locks, I'm still no closer to a solution.
> > Conntrack differs a bit between 2.6.32 and 2.6.39, but I don't think that's the reason in this case.
> > 
> > I think the netns cleanup caused this, but I'm not a conntrack expert :)
> > 
> > The dump below is from a back-ported ipvs to 2.6.32.27.
> > The extra patches I sent to Simon that rename the cleanup functions are applied, i.e.
> > __ip_vs_conn_cleanup renamed to ip_vs_conn_net_cleanup etc.
> > 
> > 
> > [  532.287410] CPU 3 
> > [  532.287410] Modules linked in: xt_mark xt_conntrack ip_vs_wrr(N) ip_vs_lc(N) xt_tcpudp ip_vs_rr(N) nf_conntrack_ipv6 xt_MARK xt_state xt_CONNMARK xt_connmark xt_multiport nf_conntrack_netlink nfnetlink xt_hmark(N) ip6table_mangle iptable_mangle ip6table_filter iptable_filter ip6_tables ip_tables x_tables nf_conntrack_ipv4 nf_defrag_ipv4 ip_vs(N) nf_conntrack ip6_tunnel tunnel6 tipc(N) nfs fscache af_packet nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs drbd softdog bonding macvlan ipv6 ext3 jbd mbcache loop dm_mod usbhid hid ide_pci_generic piix ide_core ata_generic ata_piix ahci libata hpsa uhci_hcd hpilo xen_platform_pci cdrom pcspkr ehci_hcd cciss tpm_tis tpm tpm_bios bnx2 serio_raw ipmi_si ipmi_msghandler i5k_amb i5000_edac rtc_cmos rtc_core rtc_lib usbcore container e1000e edac_core shpchp pci_hotplug scsi_mod button thermal processor thermal_sys hwmon
> > [  532.350212] Supported: Yes
> > [  532.350212] Pid: 17, comm: netns Tainted: G          N  2.6.32.27-0.2.2.2501.1.PTF-evip #1 ProLiant DL380 G5
> > [  532.386359] RIP: 0010:[<ffffffff8131aaf6>]  [<ffffffff8131aaf6>] netlink_has_listeners+0x36/0x40
> > [  532.386359] RSP: 0018:ffff880005ac3bf0  EFLAGS: 00010246
> > [  532.386359] RAX: 0000000000000002 RBX: ffff8800345e2ea8 RCX: ffff880005ac3c98
> > [  532.386359] RDX: ffff88012380d740 RSI: 0000000000000003 RDI: ffff88012819a400
> > [  532.386359] RBP: ffff880005ac3d28 R08: ffffffffa0641ca8 R09: 0000000000000024
> > [  532.386359] R10: 0000000000004000 R11: 0000000000000000 R12: ffff8800345e2ea8
> > [  532.386359] R13: 0000000000000002 R14: 0000000000000004 R15: ffff88012a875fd8
> > [  532.386359] FS:  0000000000000000(0000) GS:ffff880005ac0000(0000) knlGS:0000000000000000
> > [  532.386359] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > [  532.386359] CR2: 00007f6a0ecba000 CR3: 0000000001804000 CR4: 00000000000406e0
> > [  532.386359] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [  532.386359] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > [  532.386359] Process netns (pid: 17, threadinfo ffff88012a874000, task ffff88012a872500)
> > [  532.485087] Stack:
> > [  532.485087]  ffffffffa063ff92 ffffffff811e2805 ffff880005ac3c98 00000004811e2805
> > [  532.485087] <0> ffff88011d9bd500 0000000300000000 ffff880005ac3da0 ffffffff8103e402
> > [  532.485087] <0> ffff880005ac3d40 ffff880005ac3cb0 0000000000013700 0000000000000010
> > [  532.485087] Call Trace:
> > [  532.485087]  [<ffffffffa063ff92>] ctnetlink_conntrack_event+0x92/0x730 [nf_conntrack_netlink]
> > [  532.485087]  [<ffffffffa058b274>] death_by_timeout+0xc4/0x190 [nf_conntrack]      ## ct->timeout.function(ct->timeout.data); ##
> > [  532.485087]  [<ffffffffa05c544d>] ip_vs_conn_drop_conntrack+0x13d/0x360 [ip_vs]
> > [  532.485087]  [<ffffffffa05ae30d>] ip_vs_conn_expire+0x12d/0x7d0 [ip_vs]		## expired timer ##
> > [  532.485087]  [<ffffffff81059424>] run_timer_softirq+0x174/0x240
> > [  532.549596]  [<ffffffff810544ef>] __do_softirq+0xbf/0x170
> > [  532.549596]  [<ffffffff810040bc>] call_softirq+0x1c/0x30
> > [  532.549596]  [<ffffffff81005d1d>] do_softirq+0x4d/0x80
> > [  532.549596]  [<ffffffff81054791>] local_bh_enable_ip+0xa1/0xb0			      ## ct_write_unlock_bh(idx); ##
> > [  532.549596]  [<ffffffffa05ac25b>] ip_vs_conn_net_cleanup+0xdb/0x160 [ip_vs]  ## ip_vs_flush in-lined ##
> > [  532.576259]  [<ffffffffa05afae1>] __ip_vs_cleanup+0x11/0x90 [ip_vs]
> > [  532.576259]  [<ffffffff812f840e>] cleanup_net+0x5e/0xb0
> > [  532.576259]  [<ffffffff81061468>] run_workqueue+0xb8/0x140
> > [  532.594226]  [<ffffffff8106158a>] worker_thread+0x9a/0x110
> > [  532.594226]  [<ffffffff81065696>] kthread+0x96/0xb0
> > [  532.603788]  [<ffffffff81003fba>] child_rip+0xa/0x20
> > [  532.603788] Code: 47 41 8d 4e ff 48 8d 14 80 48 8d 14 50 31 c0 48 c1 e2 03 48 03 15 7b b2 9b 00 3b 4a 3c 48 8b 7a 30 72 02 f3 c3 0f a3 0f 19 c0 c3 <0f> 0b eb fe 66 0f 1f 44 00 00 48 81 ec 98 00 00 00 41 f6 c0 01 
> > [  532.623046] RIP  [<ffffffff8131aaf6>] netlink_has_listeners+0x36/0x40
> 
> This looks like nfnetlink.c exited and destroyed the nfnl socket, but
> ip_vs was still holding a reference to a conntrack. When the conntrack
> got destroyed it created a ctnetlink event, causing an oops in
> netlink_has_listeners when trying to use the destroyed nfnetlink
> socket.
> 
> Usually this shouldn't happen since network namespace cleanup
> happens in reverse order from registration. In this case the
> reason might be that IPVS has no dependencies on conntrack
> or ctnetlink and therefore can get loaded first, meaning it
> will get cleaned up afterwards.
> 
> Does that make any sense?
> 
Yes,
From what I can see, ip_vs has a dependency on nf_conntrack but not on nf_conntrack_netlink,
i.e. nf_conntrack is loaded first, then ip_vs, and nf_conntrack_netlink last.

It's hard to tell exactly what was going on in user-space when the lxc container got killed....
Basically there is a lot of traffic (and connections) through the container, with ipvs inside:
- ipvs conntrack support is turned on
- iptables with conntrack
- conntrackd is running
- ~50 iptables rules
I'm not sure if it's only IPv4 traffic ...

Hmmm... I think I know: the culprit is conntrackd!! (i.e. it causes nf_conntrack_netlink to be loaded)
conntrackd will definitely get killed before the namespace exit starts.
I think it is as you describe; I will run some tests tomorrow.
How to solve this is another question....

Thanks a lot Patrick.

Regards
Hans Schillstrom

--
To unsubscribe from this list: send the line "unsubscribe lvs-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

