On Thu, Dec 27, 2018 at 12:18:30PM -0600, Shawn Bohrer wrote:
> On Thu, Dec 27, 2018 at 07:05:39PM +0100, Pablo Neira Ayuso wrote:
> > On Thu, Dec 27, 2018 at 11:03:04AM -0600, Shawn Bohrer wrote:
> > > On Wed, Dec 26, 2018 at 11:08:43PM +0100, Pablo Neira Ayuso wrote:
> > [...]
> > > > @@ -451,11 +426,6 @@ count_tree(struct net *net,
> > > >  				return 0; /* hotdrop */
> > > >  			} else if (ret == NF_CONNCOUNT_ADDED) {
> > > >  				return rbconn->list.count;
> > > > -			} else {
> > > > -				/* NF_CONNCOUNT_SKIP, rbconn is already
> > > > -				 * reclaimed by gc, insert a new tree node
> > > > -				 */
> > > > -				break;
> > > >  			}
> > > >  		}
> > > >  	}
> > >
> > > The packet path can run concurrently on multiple CPUs, correct? If
> > > yes, then there is a race where an empty node is about to be freed by
> > > tree_nodes_free(), which holds the rbconn->list.list_lock, while
> > > nf_conncount_add() inside count_tree() is waiting to acquire that
> > > lock. Once tree_nodes_free() releases the lock, nf_conncount_add()
> > > will add the connection to the list, but it is too late: the node
> > > has already been rb_erased from the tree.
> >
> > Right, if the CPU walking over nf_conncount_add() loses the race, we
> > have a connection object on a list that is gone.
> >
> > > I think count_tree() needs to break here if addit is true. Then
> > > insert_tree() will handle calling nf_conncount_add() while holding
> > > both the rbconn->list.list_lock and nf_conncount_locks.
> >
> > Therefore, we still need the NF_CONNCOUNT_SKIP handling (or rename it
> > to NF_CONNCOUNT_LOCK, as Florian suggested), so we can fall back on
> > adding a new list in this case. It would probably be good to add a
> > comment on this in the code.
>
> I don't think you need NF_CONNCOUNT_SKIP or NF_CONNCOUNT_LOCK. Just
> don't call nf_conncount_add() from count_tree():
>
> diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c
> index b33df77bd7a3..aa74c9be89e3 100644
> --- a/net/netfilter/nf_conncount.c
> +++ b/net/netfilter/nf_conncount.c
> @@ -420,13 +420,8 @@ count_tree(struct net *net,
>
>  		if (!addit)
>  			return rbconn->list.count;
> -
> -		ret = nf_conncount_add(&rbconn->list, tuple, zone);
> -		if (ret == NF_CONNCOUNT_ERR) {
> -			return 0; /* hotdrop */
> -		} else if (ret == NF_CONNCOUNT_ADDED) {
> -			return rbconn->list.count;
> -		}
> +		else
> +			break;
>  	}
>  }
>
> insert_tree() and tree_nodes_free() can't race because they are both
> holding nf_conncount_locks.

Good idea, we can collapse this chunk to this patch.
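
To make the race described above concrete, here is a minimal userspace
sketch. It is hypothetical illustration code, not the kernel sources:
pthread mutexes stand in for the kernel spinlocks, and struct node,
gc_side() and add_side() are invented names. It models why merely
waiting on rbconn->list.list_lock does not help count_tree(): if
tree_nodes_free() wins the race, the waiter acquires the lock only
after the node has already been erased.

/* race_model.c - hypothetical userspace model of the
 * count_tree()/tree_nodes_free() race discussed above.
 * Build with: cc -pthread race_model.c
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

struct node {
	pthread_mutex_t list_lock;	/* models rbconn->list.list_lock */
	bool erased;			/* models rb_erase() having run */
	int count;			/* models rbconn->list.count */
};

static struct node n = {
	.list_lock = PTHREAD_MUTEX_INITIALIZER,
	.erased = false,
	.count = 0,
};

/* Models tree_nodes_free(): erase the empty node under its list lock. */
static void *gc_side(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&n.list_lock);
	n.erased = true;	/* rb_erase() + free in the kernel */
	pthread_mutex_unlock(&n.list_lock);
	return NULL;
}

/* Models nf_conncount_add() called from count_tree(): blocking on the
 * list lock is not enough, because gc may win and erase the node first.
 */
static void *add_side(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&n.list_lock);
	if (n.erased)
		printf("lost the race: would add to an erased node\n");
	else
		n.count++;	/* connection counted on a live node */
	pthread_mutex_unlock(&n.list_lock);
	return NULL;
}

int main(void)
{
	pthread_t gc, add;

	pthread_create(&gc, NULL, gc_side, NULL);
	pthread_create(&add, NULL, add_side, NULL);
	pthread_join(gc, NULL);
	pthread_join(add, NULL);
	return 0;
}

Shawn's patch closes this window by construction rather than by
detection: count_tree() simply breaks out, and the add happens in
insert_tree() under nf_conncount_locks, which tree_nodes_free() also
runs under, so the erase and the add can no longer interleave.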