On Fri, Jan 13, 2023 at 12:45:03AM +0100, Florian Westphal wrote:
> Russell King (Oracle) <linux@xxxxxxxxxxxxxxx> wrote:
> > Given the packet counts as per my example above, it looks like
> > conntrack only saw:
> >
> > src=180.173.2.183 dst=78.32.30.218 SYN
> > src=78.32.30.218 dst=180.173.2.183 SYN+ACK
> > src=180.173.2.183 dst=78.32.30.218 ACK
> >
> > and I suspect at that point, the connection went silent - until
> > Exim timed out and closed the connection, as does seem to be the
> > case:
> >
> > 2023-01-11 21:32:04 no host name found for IP address 180.173.2.183
> > 2023-01-11 21:33:05 SMTP command timeout on connection from [180.173.2.183]:64332 I=[78.32.30.218]:25
> >
> > but if Exim closed the connection, why didn't conntrack pick it up?
>
> Yes, thats the question. Exim closing the connection should have
> conntrack at least pick up a fin packet from the mail server (which
> should move the entry to the 2 minute fin timeout).

Okay, update this morning. I left tcpdump running overnight having
cleared conntrack of all port 25 and 587 connections. This morning,
there's a whole bunch of new entries on conntrack.

Digging through the tcpdump and logs, it seems what is going on is:

public interface                dmz interface
origin -> mailserver SYN        origin -> mailserver SYN
mailserver -> origin SYNACK     mailserver -> origin SYNACK
origin -> mailserver ACK
mailserver -> origin RST
mailserver -> origin SYNACK     mailserver -> origin SYNACK
mailserver -> origin SYNACK     mailserver -> origin SYNACK
mailserver -> origin SYNACK     mailserver -> origin SYNACK
mailserver -> origin SYNACK     mailserver -> origin SYNACK
...
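
As an aside for anyone wanting to repeat this kind of side-by-side comparison: below is a minimal Python sketch that reports which packets show up in one tcpdump text capture but not in another. It assumes tcpdump's default one-line-per-packet text output. The function names are hypothetical, and the sample captures use documentation addresses and made-up sequence numbers rather than anything from the real traces. Packets are matched on source, destination and TCP flags only, since timestamps (and possibly options such as MSS) can differ between interfaces.

```python
import re
from collections import Counter

# Matches the start of a tcpdump text line, e.g.
# "09:52:36.599398 IP 192.0.2.10.63461 > 198.51.100.20.587: Flags [S.], seq ..."
LINE_RE = re.compile(r"^\S+ IP (\S+) > (\S+): Flags \[([^\]]+)\]")


def packet_keys(capture_text):
    """Count (src, dst, flags) tuples for every line that parses."""
    keys = Counter()
    for line in capture_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            keys[match.groups()] += 1
    return keys


def only_in_first(first_text, second_text):
    """Packets (with counts) seen in the first capture but not the second."""
    # Counter subtraction keeps only positive counts, so retransmissions
    # that appear equally often in both captures cancel out.
    return packet_keys(first_text) - packet_keys(second_text)


# Illustrative sample data only (not the real capture).
PUBLIC_CAPTURE = """\
09:52:36.599398 IP 192.0.2.10.63461 > 198.51.100.20.587: Flags [S], seq 100, win 8192, length 0
09:52:36.599893 IP 198.51.100.20.587 > 192.0.2.10.63461: Flags [S.], seq 200, ack 101, win 64240, length 0
09:52:36.820464 IP 192.0.2.10.63461 > 198.51.100.20.587: Flags [.], ack 1, win 260, length 0
09:52:36.820549 IP 198.51.100.20.587 > 192.0.2.10.63461: Flags [R], seq 201, win 0, length 0
09:52:37.637548 IP 198.51.100.20.587 > 192.0.2.10.63461: Flags [S.], seq 200, ack 101, win 64240, length 0
"""

DMZ_CAPTURE = """\
09:52:36.599729 IP 192.0.2.10.63461 > 198.51.100.20.587: Flags [S], seq 100, win 8192, length 0
09:52:36.599772 IP 198.51.100.20.587 > 192.0.2.10.63461: Flags [S.], seq 200, ack 101, win 64240, length 0
09:52:37.637421 IP 198.51.100.20.587 > 192.0.2.10.63461: Flags [S.], seq 200, ack 101, win 64240, length 0
"""

if __name__ == "__main__":
    for (src, dst, flags), count in only_in_first(PUBLIC_CAPTURE, DMZ_CAPTURE).items():
        print(f"public only: {src} > {dst} [{flags}] x{count}")
```

With the sample input, the packets reported as public-only are the bare ACK and the RST, which matches the shape of the table above.
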

Here is an example from the public interface:

09:52:36.599398 IP 103.14.225.112.63461 > 78.32.30.218.587: Flags [SEW], seq 3387227814, win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
09:52:36.599893 IP 78.32.30.218.587 > 103.14.225.112.63461: Flags [S.], seq 816385329, ack 3387227815, win 64240, options [mss 1452,nop,nop,sackOK,nop,wscale 7], length 0
09:52:36.820464 IP 103.14.225.112.63461 > 78.32.30.218.587: Flags [.], ack 1, win 260, length 0
09:52:36.820549 IP 78.32.30.218.587 > 103.14.225.112.63461: Flags [R], seq 816385330, win 0, length 0
09:52:37.637548 IP 78.32.30.218.587 > 103.14.225.112.63461: Flags [S.], seq 816385329, ack 3387227815, win 64240, options [mss 1452,nop,nop,sackOK,nop,wscale 7], length 0

and the corresponding trace on the mailserver:

09:52:36.599729 IP 103.14.225.112.63461 > 78.32.30.218.587: Flags [SEW], seq 3387227814, win 8192, options [mss 1452,nop,wscale 8,nop,nop,sackOK], length 0
09:52:36.599772 IP 78.32.30.218.587 > 103.14.225.112.63461: Flags [S.], seq 816385329, ack 3387227815, win 64240, options [mss 1460,nop,nop,sackOK,nop,wscale 7], length 0
09:52:37.637421 IP 78.32.30.218.587 > 103.14.225.112.63461: Flags [S.], seq 816385329, ack 3387227815, win 64240, options [mss 1460,nop,nop,sackOK,nop,wscale 7], length 0

So, my first observation is that conntrack is reacting to the ACK
packet on the public interface, and marking the connection established,
but a firewall rule is rejecting the connection when that ACK packet is
received by sending a TCP reset.

It looks like conntrack does not see this packet, and also conntrack
does not see the SYNACK retransmissions (which is odd, because it saw
the first one.)

As to why we're responding with a TCP reset to the ACK packet, it's
because iptables is hitting a reject rule as the IP address has been
temporarily banned due to preceding known spammer signatures a few
seconds before. I probably ought to pick up on the initial SYN rather
than the 3rd packet of the connection...
but even so, I don't think conntrack should be missing the TCP reset
from the reject rule.

The rule path that leads to the reject rule is currently:

-A TCP -p tcp -m multiport --dports 25,587 -m conntrack --ctstate ESTABLISHED -j TCP-smtp-in
-A TCP-smtp-in -p tcp -m set --match-set ip4-banned-smtp src -j TCP-smtp-s
-A TCP-smtp-s -j SET --add-set ip4-banned-smtp src --exist --timeout N
-A TCP-smtp-s -p tcp -j REJECT --reject-with tcp-reset

(I've omitted the timeout.)

There definitely seems to be a change in behaviour - looking back to
the logs prior to upgrading to 6.1, there were never any conntrack
table overflows, and that older kernel had been running for hundreds
of days.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!