Patrick McHardy wrote: > Patrick McHardy wrote: >> BORBELY Zoltan wrote: >>> On Tue, Nov 18, 2008 at 12:07:20PM +0100, Patrick McHardy wrote: >>>>> --- /tmp/nf_conntrack_netlink.c-orig 2008-09-29 >>>>> 23:28:55.000000000 +0200 >>>>> +++ /tmp/nf_conntrack_netlink.c 2008-09-29 23:29:11.000000000 +0200 >>>>> @@ -1177,8 +1177,8 @@ >>>>> ct->master = master_ct; >>>>> } >>>>> - add_timer(&ct->timeout); >>>>> nf_conntrack_hash_insert(ct); >>>>> + add_timer(&ct->timeout); >>>>> rcu_read_unlock(); >>>> That code looks very fishy. We should be holding the conntrack lock, >>>> otherwise the addition is not only racy against the timer, but also >>>> against addition of identical conntracks. Let me look into what >>>> happened here. >>> >>> We have experienced a lot of kernel crashes, _every time_ in the >>> death_by_timeout() function while we were trying to add a new conntrack >>> entry from userspace via netlink (attached the disassembled version >>> of the function, ===> points to the EIP upon the crash). There was a >>> possibility, that we tried to add conntrack entries with zero timeout >>> value, maybe it's necessary to trigger this crash. The previous patch >>> has definitly solved the problem for us. >>> >>> I've got photos from various crashes, but it takes a little time to >>> find them. Please let me know if you want to see them. >> >> Thats not necessary, the problem is pretty obvious, I was mainly >> wondering at what point we broke it. >> >> I'll send you a patch soon. > > Could you try whether this patch fixes the problem? > > Pablo, do you recall the reason why the lock isn't held in > ctnetlink_create_conntrack()? The creation is done under the nfnl_mutex so that requests to create identical entries cannot race. Of course, this is not enough to avoid the race with the timer if we set a very small timer for a conntrack :(. AFAICS, we don't need to enclose the whole conntrack creation path. Would you prefer the patch attached? This patch should apply fine to 2.6.28-rc. I can send this patch to -stable. BTW, this patch may conflict with my patch enqueued for 2.6.29 that adds userspace reporting, let me know if I can give you a hand in some way (like sending it on top of this one or yours again, whatever). -- "Los honestos son inadaptados sociales" -- Les Luthiers
netfilter: ctnetlink: fix racy timer in the creation path If we set a very small timer for new conntrack created via ctnetlink, the entry may expire before it is in the hashes, resulting in a crash. To fix this problem, the timer addition and the insertion into the hashes is done holding the nf_conntrack spin lock bh. Signed-off-by: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> --- include/net/netfilter/nf_conntrack.h | 2 +- net/netfilter/nf_conntrack_core.c | 7 ++----- net/netfilter/nf_conntrack_netlink.c | 4 +++- 3 files changed, 6 insertions(+), 7 deletions(-) diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h index b76a868..a05e2e2 100644 --- a/include/net/netfilter/nf_conntrack.h +++ b/include/net/netfilter/nf_conntrack.h @@ -197,7 +197,7 @@ extern void nf_ct_free_hashtable(struct hlist_head *hash, int vmalloced, extern struct nf_conntrack_tuple_hash * __nf_conntrack_find(struct net *net, const struct nf_conntrack_tuple *tuple); -extern void nf_conntrack_hash_insert(struct nf_conn *ct); +extern void __nf_conntrack_insert(struct nf_conn *ct); extern void nf_conntrack_flush(struct net *net); diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c index 622d7c6..22246ec 100644 --- a/net/netfilter/nf_conntrack_core.c +++ b/net/netfilter/nf_conntrack_core.c @@ -298,18 +298,15 @@ static void __nf_conntrack_hash_insert(struct nf_conn *ct, &net->ct.hash[repl_hash]); } -void nf_conntrack_hash_insert(struct nf_conn *ct) +void __nf_conntrack_insert(struct nf_conn *ct) { unsigned int hash, repl_hash; hash = hash_conntrack(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple); repl_hash = hash_conntrack(&ct->tuplehash[IP_CT_DIR_REPLY].tuple); - - spin_lock_bh(&nf_conntrack_lock); __nf_conntrack_hash_insert(ct, hash, repl_hash); - spin_unlock_bh(&nf_conntrack_lock); } -EXPORT_SYMBOL_GPL(nf_conntrack_hash_insert); +EXPORT_SYMBOL_GPL(__nf_conntrack_insert); /* Confirm a connection given skb; places it in hash table */ int diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c index b132124..2eb8fd3 100644 --- a/net/netfilter/nf_conntrack_netlink.c +++ b/net/netfilter/nf_conntrack_netlink.c @@ -1151,8 +1151,10 @@ ctnetlink_create_conntrack(struct nlattr *cda[], ct->master = master_ct; } + spin_lock_bh(&nf_conntrack_lock); add_timer(&ct->timeout); - nf_conntrack_hash_insert(ct); + __nf_conntrack_insert(ct); + spin_unlock_bh(&nf_conntrack_lock); rcu_read_unlock(); return 0;