On Thu, Feb 18, 2010 at 11:39 AM, Afi Gjermund <afigjermund@xxxxxxxxx> wrote:
> On Thu, Feb 18, 2010 at 10:19 AM, Patrick McHardy <kaber@xxxxxxxxx> wrote:
>> Afi Gjermund wrote:
>>> On Thu, Feb 18, 2010 at 10:07 AM, Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote:
>>>>>>> Shouldn't the value after the flush be 0? The traffic that has created
>>>>>>> this mess is from a REDIRECT rule in the PREROUTING chain of the 'nat'
>>>>>>> table.
>>>>>>
>>>>>> Could you post a copy of these rules?
>>>>>>
>>>>> iptables -t nat -A PREROUTING -p tcp -s X.X.X.X -d X.X.X.X --sport X
>>>>> --dport X -j REDIRECT --to-port X
>>>>
>>>> Yes, I understood you were using such rules, but I cannot understand how
>>>> it can trigger without real NICs being plugged in. So I asked you for
>>>> details; apparently you don't want to provide them and prefer to hide from
>>>> us :)
>>>>
>>> Lol, sorry. The X values are dynamic and depend on what network the
>>> device happens to be on, as well as the ephemeral source port.
>>>
>>> iptables -t nat -A PREROUTING -p tcp -s 172.168.8.45 -d 172.168.8.200
>>> --sport 4351 --dport 4500 -j REDIRECT --to-port 45001
>>
>> NAT is unlikely to be the cause, since it is widely used and there
>> are no other reports of leaks. Please describe your full setup,
>> especially things like traffic scheduling, network devices,
>> userspace queueing, etc.
>>
>
> The device has 2 network interfaces that are configured in a bridge
> (eth0, eth1). Traffic scheduling has not been changed from the
> default kernel configuration.
>
> Problem path:
> The problem I am seeing is that my TCP connections enter the
> /proc/net/nf_conntrack table, then disappear over time, but
> nf_conntrack_count never decreases. Over time, nf_conntrack_count
> hits the 4096 nf_conntrack_max and the kernel begins to drop packets
> even though the /proc/net/nf_conntrack table is not full (it has < 100
> connections).
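As a side note, the divergence described above can be checked directly from the proc interface. A quick diagnostic sketch, assuming a kernel that exposes the standard nf_conntrack proc files (paths may differ on older ip_conntrack-only trees):

```shell
# Compare the atomic counter against the number of entries actually listed.
# If the counter stays larger than the entry count, entries are leaking.
count_file=/proc/sys/net/netfilter/nf_conntrack_count
table_file=/proc/net/nf_conntrack

if [ -r "$count_file" ] && [ -r "$table_file" ]; then
    echo "nf_conntrack_count: $(cat "$count_file")"
    echo "table entries:      $(grep -c . "$table_file")"
else
    echo "nf_conntrack proc files not available on this kernel"
fi
```

On a leaking box the first number climbs toward nf_conntrack_max over time while the second stays small.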
>
> In testing, I decided to set nf_conntrack_max to 100 and fill the
> table via the connections above, then remove both Ethernet cables to
> ensure no new connections could be made. I also set
> nf_conntrack_tcp_timeout_established to 60 seconds. I left this for 2
> hours and saw that the /proc/net/nf_conntrack table was empty while
> nf_conntrack_count was still 100.
>
> I also created a kernel module that calls the nf_conntrack_flush()
> function; this seems to clear only the /proc/net/nf_conntrack table,
> but not the count. If I also do an atomic_set(&nf_conntrack_count, 0),
> then (obviously) the count becomes 0. It is as if the connections are
> being removed from the table, but the count is not being decremented,
> and I am not sure why. As far as I understand it, they should be in
> sync.
>

I have found the issue that was causing this problem. A userspace
application that was using the NFQUEUE mechanism to queue data to
userspace was returning a verdict of NF_STOLEN on the first UDP packet
seen. This appears to have been leaving entries in the connection table
that could not be flushed via nf_conntrack_flush(). When I changed the
verdict to NF_DROP, the problem no longer occurred. I found this after
noticing that the Timer value of the connections within the table
remained at 3000 (30 in nf_conntrack_udp_timeout x 100).
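The failure mode can be made concrete with a small toy model. This is not kernel code; it is a hypothetical shell sketch of the accounting invariant that breaks: every path that removes an entry from the table must also decrement the shared counter, and a removal path that skips the decrement (analogous to the stolen-packet path here) leaves the counter permanently inflated.

```shell
# Toy model (not kernel code): a visible table and a shared counter,
# mimicking /proc/net/nf_conntrack and nf_conntrack_count.
entries=0   # what the table listing shows
count=0     # what the atomic counter claims

add()     { entries=$((entries + 1)); count=$((count + 1)); }
destroy() { entries=$((entries - 1)); count=$((count - 1)); }  # correct path
leak()    { entries=$((entries - 1)); }                        # buggy path: no decrement

add; add; add; add     # four connections tracked
destroy; destroy       # two time out normally
leak; leak             # two removed without decrementing

echo "table entries: $entries"   # 0 -> the table looks empty
echo "count:         $count"     # 2 -> the counter still claims 2
```

This reproduces the empty-table / stuck-counter symptom described above: once enough removals take the buggy path, the counter reaches nf_conntrack_max and packets are dropped even though the table itself is nearly empty.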
Feb 18 22:56:31 titan user.info kernel: =========================== Table Dump =========================
Feb 18 22:56:31 titan user.info kernel: ---- Set ----
Feb 18 22:56:31 titan user.info kernel: Timer is : 3000
Feb 18 22:56:31 titan user.info kernel: tuple dump: IP_CT_DIR_ORIGINAL
Feb 18 22:56:31 titan user.info kernel:
Feb 18 22:56:31 titan user.warn kernel: tuple c321cc70: l3num 2 protonum 17 srcIP 172.16.8.45 srcPort 4858 -> dstIP 172.16.8.7 dstPort 45001
Feb 18 22:56:31 titan user.info kernel: tuple dump: IP_CT_DIR_REPLY
Feb 18 22:56:31 titan user.info kernel:
Feb 18 22:56:31 titan user.warn kernel: tuple c321cca8: l3num 2 protonum 17 srcIP 172.16.8.7 srcPort 45001 -> dstIP 172.16.8.45 dstPort 4858
Feb 18 22:56:31 titan user.info kernel: ---- End Set ----
Feb 18 22:56:31 titan user.info kernel: =========================== End Table Dump =========================
Feb 18 22:57:03 titan user.info kernel: =========================== Table Dump =========================
Feb 18 22:57:03 titan user.info kernel: ---- Set ----
Feb 18 22:57:03 titan user.info kernel: Timer is : 3000
Feb 18 22:57:03 titan user.info kernel: tuple dump: IP_CT_DIR_ORIGINAL
Feb 18 22:57:03 titan user.info kernel:
Feb 18 22:57:03 titan user.warn kernel: tuple c321cc70: l3num 2 protonum 17 srcIP 172.16.8.45 srcPort 4858 -> dstIP 172.16.8.7 dstPort 45001
Feb 18 22:57:03 titan user.info kernel: tuple dump: IP_CT_DIR_REPLY
Feb 18 22:57:03 titan user.info kernel:
Feb 18 22:57:03 titan user.warn kernel: tuple c321cca8: l3num 2 protonum 17 srcIP 172.16.8.7 srcPort 45001 -> dstIP 172.16.8.45 dstPort 4858
Feb 18 22:57:03 titan user.info kernel: ---- End Set ----
Feb 18 22:57:03 titan user.info kernel: =========================== End Table Dump =========================

Thank you all for your help! Hopefully this may help other people as well.
Afi
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html