Running an active/active firewall/router (xt_cluster?)

Dear netfilter experts,

we are trying to set up an active/active firewall, making use of "xt_cluster".
We can configure the switch to act like a hub, i.e. both machines can share the same MAC and IP and get the same packets without additional ARPtables tricks.

So we set rules like:

 iptables -I PREROUTING -t mangle -i external_interface -m cluster --cluster-total-nodes 2 --cluster-local-node 1 --cluster-hash-seed 0xdeadbeef -j MARK --set-mark 0xffff
 iptables -A PREROUTING -t mangle -i external_interface -m mark ! --mark 0xffff -j DROP
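For completeness, the second node would run the mirror rules; as we understand the match, only `--cluster-local-node` differs:

```shell
iptables -I PREROUTING -t mangle -i external_interface -m cluster --cluster-total-nodes 2 --cluster-local-node 2 --cluster-hash-seed 0xdeadbeef -j MARK --set-mark 0xffff
iptables -A PREROUTING -t mangle -i external_interface -m mark ! --mark 0xffff -j DROP
```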

Ideally, we'd love to be able to scale this to more than two nodes, but let's stay with two for now.

Basic tests show that this works as expected, but the details get messy.

1. Certainly, conntrackd is needed to synchronize connection states.
   But is it always "fast enough"?
   xt_cluster seems to match by the src_ip of the original direction of the flow[0] (if I read the code correctly),
   but what happens if the reply to an outgoing packet arrives at both firewalls before state is synchronized?
   We are currently using conntrackd in FTFW mode with a direct link, have set "DisableExternalCache", and additionally set "PollSecs 15":
   without polling, it seems only new and destroyed connections are synced, while lifetime updates for existing connections do not propagate.
   Maybe another scheme, e.g. hashing on XOR(src,dst), might work around the tight synchronization requirements; or is it possible to always use the "internal" source IP?
   Is anybody doing that with a custom BPF?
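To illustrate why a direction-independent key appeals to us, here is a simplified sketch (plain shell arithmetic with made-up addresses, NOT the kernel's jhash): XOR(src,dst) picks the same node for both flow directions, so a reply racing ahead of state synchronization would still land on the node that saw the original packet.

```shell
#!/bin/sh
# Simplified illustration, not the real xt_cluster hash: the kernel hashes
# the original-direction source address; here we XOR the two addresses so
# the key is identical in both directions.
src=$(( 0xC0A80001 ))   # 192.168.0.1 (example address)
dst=$(( 0x08080808 ))   # 8.8.8.8     (example address)
nodes=2
fwd=$(( (src ^ dst) % nodes ))   # packet client -> server
rev=$(( (dst ^ src) % nodes ))   # reply server -> client
echo "forward=$fwd reverse=$rev" # same node for both directions
```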

2. How to do failover in such cases?
   For failover we'd need to change these rules (if one node fails, --cluster-total-nodes changes).
   As an alternative, I found [1], which states that multiple rules can be kept in place and enabled / disabled,
   but does somebody know of a cleaner (and easier to read) way that also does not cost extra performance?
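One way we could imagine doing the rule change (script name and arguments are hypothetical, and it assumes the cluster rule sits at position 1 of mangle/PREROUTING) is to replace the rule in place with `iptables -R`, which is atomic per rule:

```shell
#!/bin/sh
# failover.sh (hypothetical): invoked by whatever detects the node failure.
TOTAL="$1"   # e.g. 1 after the peer fails, 2 when it returns
NODE="$2"    # this node's id
# Replace rule 1 in mangle/PREROUTING with the updated node count.
iptables -R PREROUTING 1 -t mangle -i external_interface \
    -m cluster --cluster-total-nodes "$TOTAL" --cluster-local-node "$NODE" \
    --cluster-hash-seed 0xdeadbeef -j MARK --set-mark 0xffff
```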

3. We have several internal networks, which need to talk to each other (partially with firewall rules and NATting),
   so we'd also need similar rules there, complicating things more. That's why a cleaner way would be very welcome :-).
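To make concrete what point 3 entails (interface name hypothetical), every internal interface would need its own pair of rules along these lines:

```shell
iptables -I PREROUTING -t mangle -i internal_interface1 -m cluster --cluster-total-nodes 2 --cluster-local-node 1 --cluster-hash-seed 0xdeadbeef -j MARK --set-mark 0xffff
iptables -A PREROUTING -t mangle -i internal_interface1 -m mark ! --mark 0xffff -j DROP
```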

4. Another point is how to actually perform the failover. Classical cluster suites (corosync + pacemaker)
   are built to migrate services, not to communicate node IDs and the number of currently active nodes.
   They can probably be tricked into doing that somehow, but they are not designed this way.
   TIPC may be something to use here, but I found nothing "ready to use".

You may also tell me there's a better way to do this than xt_cluster (a custom BPF?); up to now we've only done "classic" active/passive setups,
but maybe someone on this list has already done active/active without commercial hardware and can share their experience?

Cheers and thanks in advance,
	Oliver

PS: Please keep me in CC, I'm not subscribed to the list. Thanks!

[0] https://github.com/torvalds/linux/blob/10a3efd0fee5e881b1866cf45950808575cb0f24/net/netfilter/xt_cluster.c#L16-L19
[1] https://lore.kernel.org/netfilter-devel/499BEBBF.7080705@xxxxxxxxxxxxx/

--
Oliver Freyermuth
Universität Bonn
Physikalisches Institut, Raum 1.047
Nußallee 12
53115 Bonn
--
Tel.: +49 228 73 2367
Fax:  +49 228 73 7869
--
