Re: early_drop() possible issue?

Luca Pesce <pesce.luca@xxxxxxxxx> · Sun, 1 Nov 2009 23:58:32 +0100

Sorry, I re-worked the patch, because it was buggy (ct_general.use was
incremented, not tested).
Here is the (hopefully) correct one:

--- nf_conntrack_core.c	2009-10-23 00:57:56.000000000 +0200
+++ nf_conntrack_core_modded.c	2009-11-01 23:56:23.000000000 +0100
@@ -506,14 +506,12 @@
 		hlist_nulls_for_each_entry_rcu(h, n, &net->ct.hash[hash],
 					 hnnode) {
 			tmp = nf_ct_tuplehash_to_ctrack(h);
-			if (!test_bit(IPS_ASSURED_BIT, &tmp->status))
+			if (!test_bit(IPS_ASSURED_BIT, &tmp->status) &&
likely(!nf_ct_is_dying(ct) &&
+				   atomic_read(&ct->ct_general.use)!=0))
 				ct = tmp;
 			cnt++;
 		}

-		if (ct && unlikely(nf_ct_is_dying(ct) ||
-				   !atomic_inc_not_zero(&ct->ct_general.use)))
-			ct = NULL;
 		if (ct || cnt >= NF_CT_EVICTION_RANGE)
 			break;
 		hash = (hash + 1) % nf_conntrack_htable_size;
@@ -523,6 +521,7 @@
 	if (!ct)
 		return dropped;

+	atomic_inc(&ct->ct_general.use);
 	if (del_timer(&ct->timeout)) {
 		death_by_timeout((unsigned long)ct);
 		dropped = 1;


On Sun, Nov 1, 2009 at 10:48 PM,  <pesce.luca@xxxxxxxxx> wrote:
> I noticed that early_drop(), before killing a conntrack, checks if it is
> already dying or
> if its usage count is zero: if one of these two conditions is true, the
> conntrack
> is not killed, which is obviously fine. However, these checks are performed
> outside
> the per-chain loop: if I correctly understand the code, this could lead to
> skip a killable conntrack
> that was before (so more recently used) the least recently used one that is
> already dying.
> For example, consider the case of a hash chain with 10 entries:
>
> - entry number 1 is NOT ASSURED, is not already dying and has ct_general.use
>>=1, so it
> is a good candidate
> - entries from 2 to 9 are ASSURED - not candidates for destroying
> - entry 10 is NOT ASSURED, but is already dying by itself (or has
> ct_general.use==0)
>
> The conntrack examined are only those inside this chain (since 10 >
> NF_CT_EVICTION_RANGE),
> so the chain loop will end with ct pointing to entry number 10, which cannot
> be
> killed. In this way, early_drop() fails to free a hastable entry, while
> entry 1
> was a good candidate.
> This is just one case, but not the only, I think...
> If this analysis is correct, maybe the code should check the dying status
> and the
> usage count together with the assured bit.
> I am attaching the diff coming from this idea.
>
> If I did not correctly understood the code, I am sorry, I apologize for
> that!
>
>
> --- nf_conntrack_core.c 2009-10-23 00:57:56.000000000 +0200
> +++ nf_conntrack_core_modded.c 2009-11-01 22:22:33.000000000 +0100
> @@ -506,14 +506,12 @@
> hlist_nulls_for_each_entry_rcu(h, n, &net->ct.hash[hash],
> hnnode) {
> tmp = nf_ct_tuplehash_to_ctrack(h);
> - if (!test_bit(IPS_ASSURED_BIT, &tmp->status))
> + if (!test_bit(IPS_ASSURED_BIT, &tmp->status) &&
> unlikely(!nf_ct_is_dying(ct) &&
> + atomic_inc_not_zero(&ct->ct_general.use)))
> ct = tmp;
> cnt++;
> }
>
> - if (ct && unlikely(nf_ct_is_dying(ct) ||
> - !atomic_inc_not_zero(&ct->ct_general.use)))
> - ct = NULL;
> if (ct || cnt >= NF_CT_EVICTION_RANGE)
> break;
> hash = (hash + 1) % nf_conntrack_htable_size;
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html