Hi Florian,

Thanks for the quick reply.

On Fri, Jan 13, 2023 at 12:38:08AM +0100, Florian Westphal wrote:
> Russell King (Oracle) <linux@xxxxxxxxxxxxxxx> wrote:
> > Hi,
> >
> > I've noticed that my network at home is rather struggling, and having
> > done some investigation, I find that the router VM is dropping packets
> > due to lots of:
> >
> > nf_conntrack: nf_conntrack: table full, dropping packet
> >
> > I find that there are about 2380 established and assured connections
> > with a destination of my incoming mail server with destination port 25,
> > and 2 packets. In the reverse direction, apparently only one packet was
> > sent according to conntrack. E.g.:
> >
> > tcp 6 340593 ESTABLISHED src=180.173.2.183 dst=78.32.30.218
> > sport=49694 dport=25 packets=2 bytes=92 src=78.32.30.218
> > dst=180.173.2.183 sport=25 dport=49694 packets=1 bytes=44 [ASSURED]
> > use=1
>
> Non-early-evictable entry that will expire in ~4 days, so not really
> surprising that this eventually fills the table.
>
> I'd suggest to reduce the
> net.netfilter.nf_conntrack_tcp_timeout_established
> sysctl to something more sane, e.g. 2 minutes or so unless you need
> to have longer timeouts.
>
> But this did not change, so not the root cause of this problem.

I'll hold off trying that for now - I do tend to have some connections
that may be idle...

> > However, if I look at the incoming mail server, its kernel believes
> > there are no incoming port 25 connections, which matches exim.
> >
> > I hadn't noticed any issues prior to upgrading from 5.16 to 6.1 on the
> > router VM, and the firewall rules have been the same for much of
> > 2021/2022.
> >
> > Is this a known issue? Something changed between 5.16 and 6.1 in the
> > way conntrack works?
>
> Nothing that should have such an impact.
>
> Does 'sysctl net.netfilter.nf_conntrack_tcp_loose=0' avoid the buildup
> of such entries? I'm wondering if conntrack misses the connection
> shutdown or if it's perhaps triggering the entries because of late
> packets or similar.
>
> If that doesn't help, you could also check if
>
> 'sysctl net.netfilter.nf_conntrack_tcp_be_liberal=1' helps -- if it
> does, it's time for more debugging but it's too early to start digging
> atm. This would point at conntrack ignoring/discarding fin/reset
> packets.

I think first I need to work out how the issue arises, since it seems
to be behaving normally at the moment.

I have for example:

$ grep 173.239.196.95 bad-conntrack.log | wc -l
314

and this resolves to 173-239-196-95.azu1ez9l.com. It looks like exim was
happy with that, so would have issued its SMTP banner very shortly after
the connection was established, but all the entries in the conntrack
table show packets=2...packets=1, meaning conntrack only saw the SYN,
SYNACK and ACK packets establishing the connection, but not the packet
sending the SMTP banner, which seems mightily weird.

I've just tried this from a machine on the 'net, telnetting in to the
SMTP port: the conntrack packet counters increase beyond 2/1, and when
exim times out the connection, the conntrack entry goes away - so
everything seems to work how it should.

Digging through the logs, it looks like table-full first happened on
Dec 30th (twice), just two and a half days after boot. Then eight times
on Jan 10th, and from the 11th at about 11pm, the logs have been
sporadically flooded with the conntrack table full messages.

I'll try to keep an eye on it and dig out something a bit more useful
which may help locate what the issue is, but it seems the trigger
mechanism isn't something obvious.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
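
P.S. A rough sketch of how the stuck entries could be tallied per source
address, rather than grepping one address at a time. It assumes the log
holds one conntrack entry per line in the same layout as the entry quoted
above (original-direction tuple first, then the reply tuple). The names
ENTRY, SAMPLE and stuck_sources are made up for illustration and are not
part of any conntrack tool.

```python
import re
from collections import Counter

# Assumed layout, taken from the single entry quoted in this thread:
# original tuple with its packets= counter, then the reply tuple with its own.
ENTRY = re.compile(
    r"src=(?P<src>\S+)\s+dst=\S+\s+sport=\d+\s+dport=(?P<dport>\d+)\s+"
    r"packets=(?P<orig>\d+)\s+bytes=\d+\s+"
    r"src=\S+\s+dst=\S+\s+sport=\d+\s+dport=\d+\s+packets=(?P<reply>\d+)"
)

# The quoted entry, joined back onto one line.
SAMPLE = (
    "tcp 6 340593 ESTABLISHED src=180.173.2.183 dst=78.32.30.218 "
    "sport=49694 dport=25 packets=2 bytes=92 src=78.32.30.218 "
    "dst=180.173.2.183 sport=25 dport=49694 packets=1 bytes=44 [ASSURED] use=1"
)


def stuck_sources(lines, dport=25, orig_packets=2, reply_packets=1):
    """Count ESTABLISHED entries per source that never got past the handshake."""
    counts = Counter()
    for line in lines:
        if "ESTABLISHED" not in line:
            continue
        m = ENTRY.search(line)
        if m is None:
            continue  # line does not follow the assumed layout
        seen = (int(m["dport"]), int(m["orig"]), int(m["reply"]))
        if seen == (dport, orig_packets, reply_packets):
            counts[m["src"]] += 1
    return counts


for src, n in stuck_sources([SAMPLE]).most_common(10):
    print(f"{n:6d} {src}")
```

Passing an open file instead of the sample list, e.g.
stuck_sources(open("bad-conntrack.log")), would list the sources holding
the most handshake-only entries, which might show whether a handful of
hosts account for most of the table.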