On 24/07/2018 at 02:48, Dima Kogan wrote:
> On 19/07/2018 20:27, Adel Belhouanez wrote:
>
>> What is very important: the router/NAT system should *drop* unknown
>> outside incoming packets (thus not generate TCP RST or ICMP
>> unreachable errors). If it doesn't drop packets before conntrack
>> allows reverse-SNATing them because of the internal outgoing flow,
>> then the internal system will give up early and attempts will fail.
>
> I've been experimenting with traversing Linux NAT as well, so wanted
> to chime in on this.
>
> I can confirm that setting the NAT system to DROP unsolicited
> incoming packets is crucial for the NAT traversal to work. I
> suspect that the reason is a bit different from what is suggested
> above. (For simplicity I'll focus on the UDP case, which I think
> is what WebRTC uses.)
>
> IIUC when an unsolicited incoming datagram arrives at port P of the
> NAT box, in the absence of a DROP rule the NAT assumes that the packet
> is targeted to a local process on the NAT system itself (rather than
> to some host on the internal network). It thus allocates the local
> port P to a session between the sender and the local host (the NAT
> system itself). From what I've seen in some netfilter documentation,
> this is sometimes referred to as a "null binding". Subsequently, when
> a host on the internal network sends a datagram from the same internal
> port number P, the NAT maps it to a different external port number P',
> since port P is already allocated by this "null binding" to the local
> host. When this datagram reaches the other party, it will most often
> fail to traverse the NAT on the remote side, since it is now coming
> from an unexpected port number.
>

I was focused on TCP, where the conntrack entry is DESTROYed
immediately when the router replies with a TCP RST. For UDP, indeed,
the conntrack entry is not destroyed even though an ICMP port
unreachable is sent: it stays until its default timeout (30s).
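For contrast with the ACCEPT test that follows, the non-adverse setup
(routers silently dropping unsolicited packets addressed to themselves)
would presumably be something like the sketch below. It assumes the
same r1/r2 router namespaces as the commands in this message, and that
the drop is done through the INPUT chain policy, which is the setting
the ACCEPT commands change:

```shell
# Assumed baseline: unsolicited packets reaching the routers' own
# addresses are dropped silently, so no TCP RST or ICMP port
# unreachable is generated in response.
ip netns exec r1 iptables -P INPUT DROP
ip netns exec r2 iptables -P INPUT DROP
```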
So after doing:

ip netns exec r1 iptables -P INPUT ACCEPT
ip netns exec r2 iptables -P INPUT ACCEPT

but dropping ICMP on the systems behind (so they can keep trying):

ip netns exec s1 iptables -A INPUT -p icmp -j DROP
ip netns exec s2 iptables -A INPUT -p icmp -j DROP

ip netns exec r1 conntrack -E
[NEW] udp 17 30 src=192.0.2.2 dst=203.0.113.12 sport=1111 dport=2222 [UNREPLIED] src=203.0.113.12 dst=198.51.100.11 sport=2222 dport=1111
[NEW] udp 17 30 src=203.0.113.12 dst=198.51.100.11 sport=1024 dport=1111 [UNREPLIED] src=198.51.100.11 dst=203.0.113.12 sport=1111 dport=1024

Notice how sport=2222 was altered into sport=1024: there's a conflict,
because the datagram is considered a new flow again, because of the
ICMP port unreachable (not shown here).

30s later:

[DESTROY] udp 17 src=203.0.113.12 dst=198.51.100.11 sport=1024 dport=1111 [UNREPLIED] src=198.51.100.11 dst=203.0.113.12 sport=1111 dport=1024
[DESTROY] udp 17 src=192.0.2.2 dst=203.0.113.12 sport=1111 dport=2222 [UNREPLIED] src=203.0.113.12 dst=198.51.100.11 sport=2222 dport=1111

While, when dropping TCP RST on the systems behind for TCP:

ip netns exec s1 iptables -A INPUT -p tcp -m tcp --tcp-flags RST RST -j DROP
ip netns exec s2 iptables -A INPUT -p tcp -m tcp --tcp-flags RST RST -j DROP

ip netns exec r1 conntrack -E
[NEW] tcp 6 120 SYN_SENT src=192.0.2.2 dst=203.0.113.12 sport=1111 dport=2222 [UNREPLIED] src=203.0.113.12 dst=198.51.100.11 sport=2222 dport=1111
[DESTROY] tcp 6 src=192.0.2.2 dst=203.0.113.12 sport=1111 dport=2222 [UNREPLIED] src=203.0.113.12 dst=198.51.100.11 sport=2222 dport=1111
[NEW] tcp 6 120 SYN_SENT src=203.0.113.12 dst=198.51.100.11 sport=2222 dport=1111 [UNREPLIED] src=198.51.100.11 dst=203.0.113.12 sport=1111 dport=2222
[DESTROY] tcp 6 src=203.0.113.12 dst=198.51.100.11 sport=2222 dport=1111 [UNREPLIED] src=198.51.100.11 dst=203.0.113.12 sport=1111 dport=2222
[NEW] tcp 6 120 SYN_SENT src=192.0.2.2 dst=203.0.113.12 sport=1111 dport=2222 [UNREPLIED] src=203.0.113.12 dst=198.51.100.11 sport=2222 dport=1111
[DESTROY] tcp 6 src=192.0.2.2 dst=203.0.113.12 sport=1111 dport=2222 [UNREPLIED] src=203.0.113.12 dst=198.51.100.11 sport=2222 dport=1111
[NEW] tcp 6 120 SYN_SENT src=203.0.113.12 dst=198.51.100.11 sport=2222 dport=1111 [UNREPLIED] src=198.51.100.11 dst=203.0.113.12 sport=1111 dport=2222
[DESTROY] tcp 6 src=203.0.113.12 dst=198.51.100.11 sport=2222 dport=1111 [UNREPLIED] src=198.51.100.11 dst=203.0.113.12 sport=1111 dport=2222
[...]

the ports are not altered (but it still won't work because of the
timing).

That means, in the adverse case of ACCEPT, TCP hole punching is allowed
more than one attempt to synchronize, while UDP can succeed only once
per pair of ports if:

- the one simultaneous attempt is synchronized (by NTP and a very
  precise simultaneous emission agreement),
- there's enough delay on the internet to allow imprecision on the
  synchronized attempt.

That again can be reproduced with tc and netem from my previous example
(adding a delay of 1s on "internet" in each direction, because the test
is typed by hand):

ip netns exec in tc qdisc add dev left0 root netem delay 1000ms
ip netns exec in tc qdisc add dev right0 root netem delay 1000ms

The UDP socat command can now work if the first text on each side is
sent within 1s of the other:

[NEW] udp 17 30 src=192.0.2.2 dst=203.0.113.12 sport=1111 dport=2222 [UNREPLIED] src=203.0.113.12 dst=198.51.100.11 sport=2222 dport=1111
[UPDATE] udp 17 30 src=192.0.2.2 dst=203.0.113.12 sport=1111 dport=2222 src=203.0.113.12 dst=198.51.100.11 sport=2222 dport=1111
[UPDATE] udp 17 180 src=192.0.2.2 dst=203.0.113.12 sport=1111 dport=2222 src=203.0.113.12 dst=198.51.100.11 sport=2222 dport=1111 [ASSURED]

regards,
Adel Belhouane.