Hi, My responses are inlined.. > -----Original Message----- > From: Patrick McHardy [mailto:kaber@xxxxxxxxx] > Sent: Monday, February 25, 2008 5:41 PM > To: Juneja Kapil > Cc: netfilter-devel@xxxxxxxxxxxxxxx; Medve Emilian > Subject: Re: [PATCH] nf_conntrack_core: Updated nf_conntrack > to destroy/refresh conn irrespective of del_timer status > > Kapil Juneja wrote: > > Currently NF_CONNTRACK assumes that a running timer is > present before refreshing > > the connection or destroying it. This may not be the case > when, for example, > > another forwarding engine hooks up to it to listen to new > connections > > but disables the NF_CONNTRACK timer in order to have more control. > > In such a scenario, only control packets may be terminated > to NF_CONNTRACK for > > it to decode and update the connection status. It will not > impact the present > > scenario of kernel forwarding without the aid of any > forwarding engine. > > Do you have a pointer to the code you're talking about? The forwarding engine concept is same as the Grand Unified Flow Cache idea mooted by Rusty Russel some time back: http://lwn.net/Articles/194443/ Our architecture runs on three components - Linux NF_CONNTRACK, a Control (Linux) Module(CM), and a Forwarding Engine or Flow Caching system (FC): 1) The CM registers itself to the NF_CONNTRACK notifier chain. Whenever an assured connection notifier event is received, it extracts all the relevant tuple parameters (Src IP, Dst IP, Protocol, Src Port, Dst Port etc.) and caches them to the FC. Due to reasons mentioned subsequently, it also disables the timer for the said conntrack object. The object, however, remains in the conntrack list as long as it is not destroyed. 3) The FC, sitting at the ethernet driver level, sends all the data packets belonging to the cached connection directly to the outbound port (as identified in step 2), bypassing the Linux stack altogether. All the packets not belonging to either of the cached tuples are terminated to the Linux stack. Also TCP control packets FIN/RST/SYN are terminated irrespective of wether the connection is cached or not. 4) With assistance from the FC, the CM also runs aging on the cached connections (hence the requirement of deleting the NF_CONNTRACK timer in step 2) 5) Cached connections can be terminated (i.e, removed from cache) in two ways: i) Aging out by the CM: In this scenario, the CM removes the connection tuple from FC as well as NF_CONNTRACK by calling the corresponding timer destroy function directly. ii) Destroy via TCP control packet: All the FIN-ACK, RST, RST-ACK packets are send to conntrack irrespective of the fact that they match a cached tuple. They are picked up by the TCP conntrack module which restarts the accounting and refreshes the connection state. It is at this point that the first chunk of this patch comes into picture. 6) When the NF_CONNTRACK module is removed, it iterates through the list to destroy the detected connections. Currently, it does not remove those connections whose timers have gone off (which is the case with connections cached to FC). This is fixed by the second chunk of the patch. > > > + if (newtime - ct->timeout.expires >= HZ) { > > + /* > > + * The timer could have already been deleted > > + * while still alive (for example connection > > + * offloaded to a forwarding module other than > > + * the kernel stack). > > + */ > > + mod_timer(&ct->timeout, newtime); > > event = IPCT_REFRESH; > > This adds a race, we don't want to update the timer if it already > went off this that means the connection is already destroyed. > Same problem with the other chunk. > A timer call would have invalidated the conntrack by a call to 'death_by_timeout' (or similar such routine), thereby rendering this check redundant. Theoretically, I think the check is irrelevant unless a hypothetical timeout doesn't really invalidate the conntrack. Can you describe the race scenario mentioned by you? The other chunk for iterative cleaning of list during module removal holds true on similar reasoning. - To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html