Andrew Vagin <avagin@xxxxxxxxxxxxx> wrote:
> On Thu, Jan 09, 2014 at 09:56:22PM +0100, Florian Westphal wrote:
> > Andrew Vagin <avagin@xxxxxxxxxxxxx> wrote:
> > > Can we allocate conntrack with zero ct_general.use and increment it at
> > > the first time before inserting the conntrack into the hash table?
> > > When conntrack is allocated it is attached exclusively to one skb.
> > > It must be destroyed with skb, if it has not been confirmed, so we
> > > don't need refcnt on this stage.
> > >
> > > I found only one place, where a reference counter of unconfirmed
> > > conntrack can be incremented. It's ctnetlink_dump_table().
> >
> > What about skb_clone, etc? They will also increment the refcnt
> > if a conntrack entry is attached to the skb.
>
> We can not attach an unconfirmed conntrack to a few skb, because

s/few/new/?

> nf_nat_setup_info can be executed concurrently for the same conntrack.
>
> How do we avoid this race condition for cloned skb-s?

Simple: the assumption is that only one CPU owns the nfct, so it does
not matter if the skb is cloned in between, as there are no parallel
users.

The only possibility (that I know of) to violate this is to create
a bridge, enable the call-iptables sysctl, add -j NFQUEUE rules and then
wait for packets that need to be forwarded to several recipients,
e.g. multicast traffic.

See
http://marc.info/?l=netfilter-devel&m=131471083501656&w=2

or search for
'netfilter: nat: work around shared nfct struct in bridge case'
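
To illustrate why cloning matters for the refcount question, here is a
minimal user-space sketch. It is not kernel code: every toy_* name is
hypothetical, and the plain int counter stands in for ct_general.use,
which is an atomic in the kernel. It only models the point made above:
a clone takes its own reference on the attached conntrack, so an
unconfirmed entry can have more than one holder even though a single
CPU owns it.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only. All toy_* names are hypothetical, not kernel APIs. */
struct toy_ct {
    int use;        /* stands in for ct_general.use (atomic in the kernel) */
    int confirmed;  /* would become 1 once inserted into the hash table */
};

struct toy_skb {
    struct toy_ct *nfct;  /* attached conntrack entry, may be NULL */
};

static int toy_ct_freed;  /* counts frees so the behaviour is observable */

/* Drop one reference; free the entry when the last one goes away. */
static void toy_ct_put(struct toy_ct *ct)
{
    assert(ct->use > 0);
    if (--ct->use == 0) {
        free(ct);
        toy_ct_freed++;
    }
}

/* Allocate an skb with a fresh, unconfirmed conntrack attached to it. */
static struct toy_skb *toy_skb_alloc_with_ct(void)
{
    struct toy_skb *skb = malloc(sizeof(*skb));
    struct toy_ct *ct = malloc(sizeof(*ct));

    if (!skb || !ct) {
        free(skb);
        free(ct);
        return NULL;
    }
    ct->use = 1;        /* the reference held by this skb */
    ct->confirmed = 0;
    skb->nfct = ct;
    return skb;
}

/* Clone: the copy shares the conntrack and takes its own reference. */
static struct toy_skb *toy_skb_clone(const struct toy_skb *orig)
{
    struct toy_skb *clone = malloc(sizeof(*clone));

    if (!clone)
        return NULL;
    clone->nfct = orig->nfct;
    if (clone->nfct)
        clone->nfct->use++;
    return clone;
}

/* Free an skb and release its conntrack reference, if any. */
static void toy_skb_free(struct toy_skb *skb)
{
    if (skb->nfct)
        toy_ct_put(skb->nfct);
    free(skb);
}
```

In this model, freeing the original skb after a clone leaves the entry
alive until the clone is freed as well. If the entry started with a
count of zero instead, the clone's reference would be the only one
counted, and freeing the clone would release an entry the original
skb still points to.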