Dexuan Cui <decui@xxxxxxxxxxxxx> wrote:
> We're trying to figure out how a Docker NAT bridge occasionally sends out an
> undesired TCP RST packet, which aborts the TCP connection unexpectedly.

conntrack never sends TCP reset packets.

> A: (the docker instance): 172.17.0.2
> B: (the bridge): 10.35.4.56
> C: (the remote server): 40.121.XX.YY.
>
> 1) A sends a TCP packet, through B, to C;
>
> 2) C's reply reaches to B;
>
> 3) B immediately sends out a TCP RST packet to C;
>
> 4) A thinks C doesn't receive the packet, so A re-transmits the packet 7
> times, through B; B still does the normal NAT translation, and forwards
> all the 7 packets to C; there is no response from C (I suppose C ignores
> the packets);

This would imply that the conntrack entry is still in place.

> 5) A closes the connection by sending a TCP FIN packet; B still does the
> normal NAT translation, and forwards the packet to C; there is no
> response from C.
>
> We need to figure out what happens in step 3. It looks the bridge thinks
> something bad happened so it tries to abort the TCP connection?

It looks like the packet is pushed up to the IP stack and is routed to
localhost, so it ends up on the bridge input path rather than entering the
bridge forward path.

The only other explanation is that the iptables ruleset makes use of
'REJECT --reject-with tcp-reset' and that triggers for some reason.

> There are not a lot of concurrent TCP connections: usually there are only
> about 5 concurrent TCP connections, so I don't think the conntrack module
> runs out of the tracking table entries. We have checked "conntrack -L" and
> there are only about 700 entries.

A full table would have other symptoms; we don't blindly zap existing
assured entries.

> Can you please recommend some tools that can trace how exactly the TCP
> packet flow is processed by iptables/conntrack, especially in the case
> of NAT?

Are we talking about DNAT or SNAT? I'd guess it's SNAT/MASQUERADE, in which
case NAT should not have any effect on the forwarding decision.

> Now I'm studying some tools like ipset, nft and ulogd2.
> It looks we're able to log some iptables/conntrack events when tracing
> the packet flows, but I'm unsure if we're able to log the event of
> the undesired TCP Reset packet here.

Normally I'd suggest the TRACE target, however it generates a lot of log
messages.

I'd suggest adding

  iptables -I INPUT 1 -p tcp -s 172.17.0.2 -d 40.121.XX.YY -j LOG
  iptables -I OUTPUT 1 -p tcp --tcp-flags RST RST -d 172.17.0.2 -j LOG

and seeing if that triggers. Based on the description it should not, as
172.17.0.2 -> 40.121.XX.YY packets are supposed to be forwarded by the
bridge.
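
To rule out the REJECT explanation mentioned above, it may help to scan a dump of the ruleset for tcp-reset rules. The following is only a rough sketch: the function name find_tcp_reset_rules is made up for illustration, and it assumes the ruleset is in the text format that iptables-save prints.

```shell
# Hypothetical helper (not part of iptables): reads an iptables-save style
# dump on stdin and prints every rule that rejects with a TCP reset,
# prefixed with its line number in the dump.
# Exit status is non-zero when no such rule is found.
find_tcp_reset_rules() {
    # -e keeps grep from treating the leading "--" as an option
    grep -n -e '--reject-with tcp-reset'
}
```

On the bridge host this could be run as root with "iptables-save | find_tcp_reset_rules". No output would suggest that the RST is generated by the local TCP stack rather than by a REJECT rule.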