On Tue, Jun 1, 2010 at 1:05 PM, Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote: > Le mardi 01 juin 2010 à 08:28 +0800, Changli Gao a écrit : >> On Tue, Jun 1, 2010 at 5:21 AM, Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote: >> > >> > I had a look at current conntrack and found the 'unconfirmed' list was >> > maybe a candidate for a potential blackhole. >> > >> > That is, if a reader happens to hit an entry that is moved from regular >> > hash table slot 'hash' to unconfirmed list, >> >> Sorry, but I can't find where we do this things. unconfirmed list is >> used to track the unconfirmed cts, whose corresponding skbs are still >> in path from the first to the last netfilter hooks. As soon as the >> skbs end their travel in netfilter, the corresponding cts will be >> confirmed(moving ct from unconfirmed list to regular hash table). >> > > So netfilter is a monolithic thing. > > When a packet begins its travel into netfilter, you guarantee that no > other packet can also begin its travel and find an unconfirmed > conntrack ? > > I wonder why we use atomic ops then to track the confirmed bit :) seems no need. > > >> unconfirmed list should be small, as networking receiving is in BH. > > So according to you, netfilter/ct runs only in input path ? No. there are another entrances: local out and nf_reinject. If there isn't any packet queued, as netfilter is in atomic context, the nubmer of unconfirmed cts should be small( at most, 2 * nr_cpu?). > > So I assume a packet is handled by CPU X, creates a new conntrack > (possibly early droping an old entry that was previously in a standard > hash chain), inserted in unconfirmed list. > Oh, Thanks, I got it. > _You_ guarantee another CPU > Y, handling another packet, possibly sent by a hacker reading your > netdev mails, cannot find the conntrack that was early dropped ? > >> How about implementing unconfirmed list as a per cpu variable? > > I first implemented such a patch to reduce cache line contention, then I > asked to myself : What is exactly an unconfirmed conntrack ? Can their > number be unbounded ? If yes, we have a problem, even on a two cpus > machine. Using two lists instead of one wont solve the fundamental > problem. > > The real question is, why do we need this unconfirmed 'list' in the > first place. Is it really a private per cpu thing ? No, it isn't really private. But most of time, it is accessed locally, if it is implemented as a per cpu var. > Can you prove this, > in respect of lockless lookups, and things like NFQUEUE ? > > Each conntrack object has two list anchors. One for IP_CT_DIR_ORIGINAL, > one for IP_CT_DIR_REPLY. > > Unconfirmed list use the first anchor. This means another cpu can > definitely find an unconfirmed item in a regular hash chain, since we > dont respect an RCU grace period before re-using an object. > > If memory was not a problem, we probably would use a third anchor to > avoid this, or regular RCU instead of SLAB_DESTROY_BY_RCU variant. > thanks for your explaining. -- Regards, Changli Gao(xiaosuo@xxxxxxxxx) -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html