On Thursday 22 April 2010 at 15:17 +0200, Patrick McHardy wrote:
> Changli Gao wrote:
> > On Thu, Apr 22, 2010 at 8:58 PM, Jesper Dangaard Brouer <hawk@xxxxxxx> wrote:
> >> At an unnamed ISP, we experienced a DDoS attack against one of our
> >> customers. This attack also caused problems for one of our Linux
> >> based routers.
> >>
> >> The attack was "only" generating 300 kpps (packets per sec), which
> >> usually isn't a problem for this (fairly old) Linux Router. But the
> >> conntracking system choked and reduced pps processing power to
> >> 40 kpps.
> >>
> >> I do extensive RRD/graph monitoring of the machines. The IP conntrack
> >> searches in the period exploded, to a stunning 700,000 searches per
> >> sec.
> >>
> >> http://people.netfilter.org/hawk/DDoS/2010-04-12__001/conntrack_searches001.png
> >>
> >> First I thought it might be caused by bad hashing, but after reading
> >> the kernel code (func: __nf_conntrack_find()), I think it's caused by
> >> the loop restart (goto begin) of the conntrack search, running under
> >> local_bh_disable(). These RCU changes to conntrack were introduced in
> >> ea781f19 by Eric Dumazet.
> >>
> >> Code: net/netfilter/nf_conntrack_core.c
> >> Func: __nf_conntrack_find()
> >>
> >> struct nf_conntrack_tuple_hash *
> >> __nf_conntrack_find(struct net *net, const struct nf_conntrack_tuple *tuple)
> >> {
> >>         struct nf_conntrack_tuple_hash *h;
> >>         struct hlist_nulls_node *n;
> >>         unsigned int hash = hash_conntrack(tuple);
> >>
> >>         /* Disable BHs the entire time since we normally need to disable them
> >>          * at least once for the stats anyway.
> >>          */
> >>         local_bh_disable();
> >> begin:
> >>         hlist_nulls_for_each_entry_rcu(h, n, &net->ct.hash[hash], hnnode) {
> >>                 if (nf_ct_tuple_equal(tuple, &h->tuple)) {
> >>                         NF_CT_STAT_INC(net, found);
> >>                         local_bh_enable();
> >>                         return h;
> >>                 }
> >>                 NF_CT_STAT_INC(net, searched);
> >>         }
> >>         /*
> >>          * if the nulls value we got at the end of this lookup is
> >>          * not the expected one, we must restart lookup.
> >>          * We probably met an item that was moved to another chain.
> >>          */
> >>         if (get_nulls_value(n) != hash)
> >>                 goto begin;
> >>         local_bh_enable();
> >>
> >
> > We should add a retry limit there.
>
> We can't do that since that would allow false negatives.

If one hash slot is under attack, then there is a bug somewhere.

If we cannot avoid this, we can fall back to a secure mode at the second
retry, and take the spinlock.

This way, most lookups stay lockless (one pass), and some might take
the slot lock to avoid the possibility of a loop.

I suspect a bug elsewhere, quite frankly!

We have a chain whose end pointer doesn't match the expected one.
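
For illustration only, here is a minimal userspace sketch of the fallback
idea above: walk the chain locklessly, and if the end marker still does not
match after one retry, take the slot lock for a final pass. It is a
single-threaded model, not kernel code: there are no RCU primitives or memory
barriers, a pthread mutex stands in for the per-slot spinlock, and every name
in it (struct node, table_init, chain_insert, ct_lookup, NBUCKETS,
LOCKLESS_PASSES) is hypothetical. The choice of two lockless passes before
locking is an assumption about what "second retry" means.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 4          /* hypothetical table size */
#define LOCKLESS_PASSES 2   /* assumption: first pass plus one lockless retry */

struct node {
        int key;
        struct node *next;  /* real pointer, or a nulls marker (low bit set) */
};

struct bucket {
        struct node *first;
        pthread_mutex_t lock;   /* stands in for the per-slot spinlock */
};

static struct bucket table[NBUCKETS];

/* A nulls marker encodes the bucket number in an odd "pointer". */
static inline struct node *make_nulls(unsigned int v)
{
        return (struct node *)(((uintptr_t)v << 1) | 1);
}

static inline int is_nulls(const struct node *p)
{
        return (int)((uintptr_t)p & 1);
}

static inline unsigned int nulls_value(const struct node *p)
{
        return (unsigned int)((uintptr_t)p >> 1);
}

static inline unsigned int hash_key(int key)
{
        return (unsigned int)key % NBUCKETS;
}

void table_init(void)
{
        for (unsigned int i = 0; i < NBUCKETS; i++) {
                table[i].first = make_nulls(i);   /* empty chain ends in its own marker */
                pthread_mutex_init(&table[i].lock, NULL);
        }
}

void chain_insert(struct node *n)
{
        struct bucket *b = &table[hash_key(n->key)];

        assert(((uintptr_t)n & 1) == 0);   /* low bit must be free for the marker */
        pthread_mutex_lock(&b->lock);
        n->next = b->first;
        b->first = n;
        pthread_mutex_unlock(&b->lock);
}

/* One pass over a chain; on a miss, *end receives the nulls value reached. */
static struct node *walk(unsigned int hash, int key, unsigned int *end)
{
        struct node *p = table[hash].first;

        for (; !is_nulls(p); p = p->next)
                if (p->key == key)
                        return p;
        *end = nulls_value(p);
        return NULL;
}

/* Lockless lookup that falls back to the slot lock instead of looping. */
struct node *ct_lookup(int key, int *took_lock)
{
        unsigned int hash = hash_key(key), end = 0;
        struct node *n;

        *took_lock = 0;
        for (int pass = 0; pass < LOCKLESS_PASSES; pass++) {
                n = walk(hash, key, &end);
                if (n || end == hash)
                        return n;   /* hit, or a miss that ended on the right chain */
        }

        /* Fallback pass: writers are excluded, so this result is taken as final. */
        *took_lock = 1;
        pthread_mutex_lock(&table[hash].lock);
        n = walk(hash, key, &end);
        pthread_mutex_unlock(&table[hash].lock);
        return n;
}
```

In this model a chain whose end marker names the wrong bucket (the situation
suspected above) makes the original "goto begin" loop spin forever, whereas
ct_lookup() gives up the lockless path after LOCKLESS_PASSES and returns the
result of the locked walk. Ordinary hits and misses never touch the lock.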