On Thu, 25 Nov 2010 07:34:33 +0100 Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote: > Le mercredi 24 novembre 2010 à 22:27 -0800, Stephen Hemminger a écrit : > > A customer reported a crash and the backtrace showed that > > ctnetlink_dump_table was running while a conntrack entry was > > being destroyed. It looks like the code for walking the table > > with hlist_nulls_for_each_entry_rcu is not correctly handling the > > case where it finds a deleted entry. > > > > According to RCU documentation, when using hlist_nulls the reader > > must handle the case of seeing a deleted entry and not proceed > > further down the linked list. For lookup the correct behavior would > > be to restart the scan, but that would generate duplicate entries. > > > > This patch is the simplest one of three alternatives: > > 1) if dead entry detected, skip the rest of the hash chain (see below) > > 2) remember skb location at start of hash chain and rescan that chain > > 3) switch to using a full lock when scanning rather than RCU. > > It all depends on the amount of effort versus consistency of results. > > > > Signed-off-by: Stephen Hemminger <shemminger@xxxxxxxxxx> > > > > > > --- a/net/netfilter/nf_conntrack_netlink.c 2010-11-24 14:11:27.661682148 -0800 > > +++ b/net/netfilter/nf_conntrack_netlink.c 2010-11-24 14:22:28.431980247 -0800 > > @@ -651,8 +651,12 @@ restart: > > if (NF_CT_DIRECTION(h) != IP_CT_DIR_ORIGINAL) > > continue; > > ct = nf_ct_tuplehash_to_ctrack(h); > > + > > + /* if entry is being deleted then can not proceed > > + * past this point. */ > > if (!atomic_inc_not_zero(&ct->ct_general.use)) > > - continue; > > + break; > > + > > /* Dump entries of a given L3 protocol number. > > * If it is not specified, ie. l3proto == 0, > > * then dump everything. */ > > -- > > Hmm... > > How restarting the loop can be a problem ? At this point in the loop, some entries have been placed in the netlink dump buffer. Restarting the loop will cause duplicate entries. > There must be a bug somewhere else that your patch try to avoid, not to > really fix. > > Normally, destroyed ct is removed eventually from the chain, so this > lookup should stop. Because hlist_nulls it is possible to walk into a dead entry, in that case the next pointer is no longer valid. -- -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html