Ohai!

After hours of digging around, eating logs, debugging tcpdumps and having some conversations on #shorewall at Freenode, I am convinced that the setup I was trying to build is simply impossible to run well.

First of all, conntrackd is working fine here. All connection states are synchronized correctly. It seems the connection had simply not been synced yet when the SYN-ACK reached the second node. It feels more like a race condition.

I decided to use a stateless firewall ruleset for all the transit traffic flowing through the active/active asymmetric multi-path router cluster.

Until now I have only used "Filter From Userspace". I did not find any explanation of the difference between Userspace and Kernelspace. Could someone shed some light on this?

On 2/28/19 5:57 PM, n3phr0n wrote:
> Hey ML!
>
> Sorry for the noise: resending because of bad encoding and typos...
>
> I am trying to build an active/active asymmetric multi-path cluster with
> a stateful firewall on top.
>
> Basically I have 2 routers connected to a different AS via BGP on
> multiple Links.
>
> I am using keepalived to have a VIP on LAN side to route all internal
> traffic through it. As Packet flow is eventually exiting via one link
> and returning on the other, I need to synchronize all known connection
> states. Otherwise the Firewall on the other node will drop the packets
> as its not aware of e.g. the tcp connection itself.
>
> My configuration so far:
>
>> Sync {
>>     Mode FTFW {
>>         DisableExternalCache On
>>         ResendQueueSize 131072
>>         PurgeTimeout 5
>>         ACKWindowSize 300
>>     }
>>
>>     Multicast {
>>         IPv4_address 225.0.0.50
>>         Group 3780
>>         IPv4_interface 10.4.48.14
>>         Interface bond2
>>         SndSocketBuffer 1249280
>>         RcvSocketBuffer 1249280
>>         Checksum On
>>     }
>>
>>     Options {
>>         TCPWindowTracking On
>>         ExpectationSync On
>>     }
>>
>> }
>>
>> General {
>>     Nice -20
>>     HashSize 32768
>>     HashLimit 131072
>>     LogFile /var/log/conntrackd.log
>>     LockFile /var/lock/conntrack.lock
>>     UNIX {
>>         Path /var/run/conntrackd.ctl
>>         Backlog 20
>>     }
>>     NetlinkBufferSize 2097152
>>     NetlinkBufferSizeMaxGrowth 8388608
>>     Filter From Userspace {
>>         Protocol Accept {
>>             TCP
>>             SCTP
>>             DCCP
>>             UDP
>>             ICMP
>>             IPv6-ICMP
>>         }
>>         Address Ignore {
>>             IPv4_address 127.0.0.0/8
>>             IPv4_address 46.243.94.14
>>             IPv4_address 10.4.48.14
>>             IPv4_address 10.243.163.14
>>             IPv4_address 172.27.3.14
>>             IPv4_address 169.254.0.0/16
>>             IPv4_address 10.4.48.1
>>             IPv6_address ::1/128
>>             IPv6_address 2a02:2b80:101:677::14
>>         }
>>     }
>> }
>
> Second node is X.X.X.15
>
> Actually conntrackd is working so far:
>
>> node1 $ conntrackd -s
>> cache internal:
>> current active connections: 62959
>> connections created: 489195 failed: 0
>> connections updated: 1156570 failed: 0
>> connections destroyed: 426236 failed: 0
>>
>> external inject:
>> connections created: 221071 failed: 0
>> connections updated: 21 failed: 0
>> connections destroyed: 61344 failed: 0
>>
>> traffic processed:
>> 0 Bytes 0 Pckts
>>
>> multicast traffic (active device=bond2):
>> 145907924 Bytes sent 33912008 Bytes recv
>> 1978321 Pckts sent 350383 Pckts recv
>> 0 Error send 0 Error recv
>>
>> message tracking:
>> 0 Malformed msgs 3 Lost msgs
>
>> node2 $ conntrackd -s
>> cache internal:
>> current active connections: 1537
>> connections created: 224062 failed: 0
>> connections updated: 21 failed: 0
>> connections destroyed: 222525 failed: 0
>>
>> external inject:
>> connections created: 491746 failed: 0
>> connections updated: 1160477 failed: 0
>> connections destroyed: 348601 failed: 0
>>
>> traffic processed:
>> 0 Bytes 0 Pckts
>>
>> multicast traffic (active device=bond2):
>> 34254992 Bytes sent 147212112 Bytes recv
>> 353358 Pckts sent 1995846 Pckts recv
>> 0 Error send 0 Error recv
>>
>> message tracking:
>> 0 Malformed msgs 0 Lost msgs
>
> I was able to see the synchronization of an ICMP connection and the
> incoming packet flow was actually accepted on the second node as the
> state was known. It was _not_ working before conntrackd was running.
>
> But its not working for TCP connections which are known on node1 as
> SYN_SENT UNREPLIED. They do not get synced to the other node and hence
> the firewall on the second node is dropping the SYN_ACK packet.
>
> What am I missing?
>

--
Best regards,
Michael Gerlach
Development Operations

T +49 761 88788 321
F +49 761 88788 9
michael.gerlach@xxxxxxxxxxx
www.reservix.de

Reservix GmbH, Postfach 1212, 79012 Freiburg
Headquarters: Reservix GmbH, Humboldtstraße 2, 79098 Freiburg
Registered office: Freiburg im Breisgau, AG Freiburg, HRB 700054
Management: Helge Hollander, Katrin Stahlberg, Johannes Tolle
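
P.S. A side note on the `conntrackd -s` output quoted in this thread: in both snapshots the internal cache's created counter minus its destroyed counter equals the current active connections (489195 - 426236 = 62959 on node1, 224062 - 222525 = 1537 on node2). Below is a minimal Python sketch for pulling those counters out of the text and checking this. The function name `parse_conntrackd_stats` and the line layout of the sample are assumptions: the original whitespace was lost, and the regex only covers the counter lines shown here, not everything conntrackd may print.

```python
import re

# Counter lines look like "connections created: 489195 failed: 0"
# or "current active connections: 62959".
COUNTER_RE = re.compile(r"^(current active connections|connections \w+):\s+(\d+)")


def parse_conntrackd_stats(text):
    """Return {section: {counter_name: value}} for `conntrackd -s` style text."""
    stats = {}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":"):
            # Section headers such as "cache internal:" end with a colon.
            section = line[:-1]
            stats[section] = {}
            continue
        match = COUNTER_RE.match(line)
        if match and section is not None:
            stats[section][match.group(1)] = int(match.group(2))
    return stats


# Figures copied from the node1 output in this thread; the layout is assumed.
NODE1_SAMPLE = """\
cache internal:
current active connections: 62959
connections created: 489195 failed: 0
connections updated: 1156570 failed: 0
connections destroyed: 426236 failed: 0

external inject:
connections created: 221071 failed: 0
connections updated: 21 failed: 0
connections destroyed: 61344 failed: 0
"""

if __name__ == "__main__":
    internal = parse_conntrackd_stats(NODE1_SAMPLE)["cache internal"]
    derived = internal["connections created"] - internal["connections destroyed"]
    # Compare the derived value with the reported active connection count.
    print(derived, internal["current active connections"])
```

Running the same check against both nodes at the same moment would make it easier to see how far the external inject counters on one node lag behind the internal cache on the other.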