[RFC] netfilter: conntrack race between dump_table and destroy

Stephen Hemminger <shemminger@xxxxxxxxxx> · Wed, 24 Nov 2010 22:27:16 -0800

A customer reported a crash, and the backtrace showed that
ctnetlink_dump_table was running while a conntrack entry was
being destroyed.  It looks like the code that walks the table
with hlist_nulls_for_each_entry_rcu does not correctly handle the
case where it finds a deleted entry.

According to the RCU documentation, a reader using hlist_nulls
must handle the case of seeing a deleted entry and must not proceed
further down the linked list.  For a lookup the correct behavior would
be to restart the scan, but for a dump that would generate duplicate
entries.

This patch is the simplest of three alternatives:
  1) if a dead entry is detected, skip the rest of the hash chain (see below)
  2) remember the skb location at the start of the hash chain and rescan
     that chain
  3) switch to using a full lock when scanning rather than RCU.
Which one to pick depends on the amount of effort versus the consistency
of the results.

Signed-off-by: Stephen Hemminger <shemminger@xxxxxxxxxx>

--- a/net/netfilter/nf_conntrack_netlink.c	2010-11-24 14:11:27.661682148 -0800
+++ b/net/netfilter/nf_conntrack_netlink.c	2010-11-24 14:22:28.431980247 -0800
@@ -651,8 +651,12 @@ restart:
 			if (NF_CT_DIRECTION(h) != IP_CT_DIR_ORIGINAL)
 				continue;
 			ct = nf_ct_tuplehash_to_ctrack(h);
+
+			/* if entry is being deleted then can not proceed
+			 * past this point. */
 			if (!atomic_inc_not_zero(&ct->ct_general.use))
-				continue;
+				break;
+
 			/* Dump entries of a given L3 protocol number.
 			 * If it is not specified, ie. l3proto == 0,
 			 * then dump everything. */