On Mon, Jun 18, 2018 at 10:37 AM, Florian Westphal <fw@xxxxxxxxx> wrote: > Yi-Hung Wei and Justin Pettit found a race in the garbage collection scheme > used by nf_conncount. > > When we don't find a result for a given tuple that is stored in > the conncount list when doing a lookup in the connection tracking table, > we remove this tuple from our list because we erronously believed that > the entry is stale and conntrack entry long gone. > > This is the common cause, but turns out its not the only one: The > conncount list entry could have been created just before by another cpu, > i.e. the conntrack entry might not yet have been inserted into the global > hash. > > The avoid this, we introduce a timestamp and the owning cpu. > If the entry appears to be stale, evict only if: > 1. The cpu finding its stale is the one that added it > 2. The timestamp is older than two jiffies > > The second constaint allows GC to be taken over by other > cpu too (e.g. because a cpu was offlined, or napi got moved around). > > This most likely also fixes an xt_connlimit imbalance earlier reported > by Dmitry Andrianov. > > Cc: Dmitry Andrianov <dmitry.andrianov@xxxxxxxxxxx> > Reported-by: Justin Pettit <jpettit@xxxxxxxxxx> > Reported-by: Yi-Hung Wei <yihung.wei@xxxxxxxxx> > Signed-off-by: Florian Westphal <fw@xxxxxxxxx> > --- Thanks for the fix. It looks good to me. Acked-by: Yi-Hung Wei <yihung.wei@xxxxxxxxx> > diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c > index d8383609fe28..85aa7be75bbd 100644 > +static const struct nf_conntrack_tuple_hash * > +find_or_evict(struct net *net, struct nf_conncount_tuple *conn) > +{ > + const struct nf_conntrack_tuple_hash *found; > + unsigned long a, b; > + int cpu = raw_smp_processor_id(); > + > + found = nf_conntrack_find_get(net, &conn->zone, &conn->tuple); > + if (found) > + return found; > + > + b = conn->jiffies32; > + a = (u32)jiffies; We might have issue when jiffies roll over u32. However, I think it makes sense to save 4 bytes to trade off the rare corner case. > + > + /* conn might have been added just before by another cpu and > + * might still be unconfirmed. In this case, nf_conntrack_find() > + * returns no result. Thus only evict if this cpu added the > + * stale entry or if the entry is older than two jiffy. > + */ > + if (conn->cpu == cpu || time_after(a, b)) { > + hlist_del(&conn->node); > + kmem_cache_free(conncount_conn_cachep, conn); > + } > + > + return NULL; > +} -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html