Re: conntrackd failover works partially, was Re: conntrack performance test results in INVALID packets

Hi Bernhard,

Bernhard Bock wrote:
> My next step is to run two firewalls in a cluster with conntrackd.
> 
> The basic setup works like a charm. I have increased the HashSize
> parameter in conntrackd as well. It replicates the states to the backup
> firewall just fine.
> 
> Unfortunately, failover works only in about 50% of all tests. There is
> no obvious pattern as to when these failures occur.
> 
> We trigger the failover softly by advertising a higher priority on the
> backup firewall, not by switching off the primary one. If it goes well,
> we do not lose a single connection. If it doesn't go well, we basically
> lose all connections and the apachebench dies. There are hundreds of
> INVALID packets in the syslog, and also some NEW (not SYN). In this
> case, we also see lost packets in "multicast sequence tracking" in the
> conntrackd stats.

I think that I have reproduced your problem in my testbed. Say you have
two nodes: A and B. Initially, A is primary and B is backup.

1) You generate tons of http traffic: A successfully replicates states to B.
2) You trigger the fail-over: B becomes primary and A becomes backup. B
successfully recovers the connections. Moreover, if you run `conntrack -L
-p tcp' on A, you still see lots of entries.
3) Just a bit later - 30 seconds later or so - you trigger the fail-over
again, from B back to A. In this case, A fails to recover the entries and
logs tons of INVALID messages.

The problem is the entries that are stuck in A (see step 2). Those
old entries clash with the newly committed entries, and the TCP state
tracking code gets confused by the old state information.
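
To make the clash concrete, here is a rough toy model in Python. It is
only a sketch and not how the kernel or conntrackd are implemented: the
Entry type, the commit() and classify() helpers and the WINDOW value are
all made up for illustration, and it assumes that an entry already present
in the kernel table wins over the replicated one.

```python
from dataclasses import dataclass

WINDOW = 65535  # hypothetical acceptance window, purely illustrative


@dataclass
class Entry:
    state: str     # e.g. "ESTABLISHED"
    last_seq: int  # last sequence number this node saw for the flow


def commit(kernel_table, replicated):
    """Inject replicated entries; in this model an existing entry wins."""
    clashes = []
    for flow, entry in replicated.items():
        if flow in kernel_table:
            clashes.append(flow)  # stale entry stays, fresh state is lost
        else:
            kernel_table[flow] = entry
    return clashes


def classify(kernel_table, flow, seq):
    """Very rough stand-in for TCP window tracking."""
    entry = kernel_table.get(flow)
    if entry is None:
        return "NEW"
    if abs(seq - entry.last_seq) > WINDOW:
        return "INVALID"
    entry.last_seq = seq
    return "ESTABLISHED"


flow = ("10.0.0.1", 40000, "10.0.0.2", 80)
fresh = {flow: Entry("ESTABLISHED", 900_000)}  # state replicated from B

# Node A still holds the entry left over from step 2.
stale_table = {flow: Entry("ESTABLISHED", 1_000)}
stale_clashes = commit(stale_table, fresh)
stale_verdict = classify(stale_table, flow, 900_100)

# Same commit against a table that was emptied beforehand.
clean_table = {}
clean_clashes = commit(clean_table, fresh)
clean_verdict = classify(clean_table, flow, 900_100)

print(stale_clashes, stale_verdict)
print(clean_clashes, clean_verdict)
```

In this model the packet is judged against the leftover entry when the
table still holds old state, and against the replicated entry when the
table starts out empty.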

This problem is fixed in the git repository. Now, we purge the entries
in A 15 seconds after this node becomes backup; this delay is tunable
via PurgeTimeout. Thus, the old entries do not clash with the brand-new
ones.
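
The timing can be sketched with another toy model in Python. This is an
illustration under assumptions, not conntrackd's real code: the
BackupPurgeModel class and its methods are hypothetical names, the purge
is modelled as a plain flush of the table, and time is passed in
explicitly to keep the example deterministic.

```python
class BackupPurgeModel:
    """Toy model of a delayed purge on the node that becomes backup."""

    def __init__(self, purge_timeout=15.0):
        self.purge_timeout = purge_timeout  # seconds, plays the role of PurgeTimeout
        self.kernel_table = {}              # flow -> state string
        self.purge_at = None                # time of the pending purge, if any

    def become_backup(self, now):
        # Schedule the purge instead of flushing immediately.
        self.purge_at = now + self.purge_timeout

    def tick(self, now):
        # Run the purge once its deadline has passed.
        if self.purge_at is not None and now >= self.purge_at:
            self.kernel_table.clear()
            self.purge_at = None

    def commit(self, now, replicated):
        # Return the flows that clash with entries still in the table.
        self.tick(now)
        clashes = sorted(f for f in replicated if f in self.kernel_table)
        for flow, state in replicated.items():
            self.kernel_table.setdefault(flow, state)
        return clashes


def fail_back_after(seconds):
    node_a = BackupPurgeModel(purge_timeout=15.0)
    node_a.kernel_table = {"flow-1": "old", "flow-2": "old"}
    node_a.become_backup(now=0.0)
    return node_a.commit(now=seconds, replicated={"flow-1": "fresh"})


early = fail_back_after(5.0)   # fail back before the purge has run
late = fail_back_after(30.0)   # fail back 30 seconds later, as in step 3
print(early, late)
```

With the 30 second gap from step 3, the purge has already run by the time
A commits the replicated entries, so nothing is left to clash with. A
fail-back that happens before the timeout expires would still meet the old
entries in this model.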

Moreover, I have completely reworked the fail-over script; you can find
it under doc/ in the conntrack-tools git tree [1]. You may want to give
it a try. I expect to release a new version of conntrack-tools with these
updates soon. New (more complete) documentation is also on the way.

Please, let me know how it goes.

[1] http://git.netfilter.org

-- 
"Los honestos son inadaptados sociales" -- Les Luthiers