When using an ipvs service in combination with SNAT and a NOTRACK rule, specific circumstances can lead to the TCP ports of packets being changed mid-stream. The connection is then established successfully, but no data can effectively be sent over it.
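For anyone who wants to recreate the router state shown in the example that follows, it could presumably be produced with commands along these lines. This is only a sketch reconstructed from the listings, not necessarily the exact commands that were used. The helper names `run` and `setup_router` and the `DRY_RUN` switch are made up for illustration, and the interface addresses and IP forwarding are assumed to be configured already.

```shell
#!/bin/sh
# Sketch: recreate the router configuration from the listings.
# run, setup_router and DRY_RUN are hypothetical helper names.

# With DRY_RUN=1 the command is only printed, otherwise it is executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_router() {
    # Let ipvs keep conntrack entries for the connections it handles.
    run sysctl -w net.ipv4.vs.conntrack=1
    # Exempt client traffic to the virtual service from connection tracking.
    run iptables -t raw -A PREROUTING -i enp1s0 -d 10.0.0.1 -p tcp --dport 1234 -j CT --notrack
    # SNAT traffic from the client network towards the real servers.
    run iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -d 10.0.1.0/24 -j SNAT --to-source 10.0.1.1
    # Round-robin virtual service with two masqueraded real servers.
    run ipvsadm -A -t 10.0.0.1:1234 -s rr
    run ipvsadm -a -t 10.0.0.1:1234 -r 10.0.1.2:1234 -m
    run ipvsadm -a -t 10.0.0.1:1234 -r 10.0.1.3:1234 -m
}

# Usage (as root on the router): setup_router
# Preview only:                  DRY_RUN=1; setup_router
```

Running it with `DRY_RUN=1` first makes it easy to compare the commands against the listings before anything is changed on the router.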
Consider the following example:

```
root@router:~# sysctl net.ipv4.vs.conntrack
net.ipv4.vs.conntrack = 1
root@router:~# iptables -t raw -L -n -v
Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
   24  1296 CT         tcp  --  enp1s0 *       0.0.0.0/0            10.0.0.1             tcp dpt:1234 NOTRACK
root@router:~# iptables -t nat -L -n -v
Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    4   240 SNAT       all  --  *      *       10.0.0.0/24          10.0.1.0/24          to:10.0.1.1

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
root@router:~# ipvsadm -L -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.0.0.1:1234 rr
  -> 10.0.1.2:1234                Masq    1      0          0
  -> 10.0.1.3:1234                Masq    1      0          0
```

The real servers are running

```
socat TCP4-LISTEN:1234,fork 'EXEC:sh -c echo${IFS}hello;read${IFS}r${IFS}L;sleep${IFS}1'
```

We dump the network traffic between the router and client on the ipvs router as follows:
```
root@router:~# tcpdump -pXXni enp1s0 icmp or tcp -w /tmp/ipvs_port_reuse.pcap
tcpdump: listening on enp1s0, link-type EN10MB (Ethernet), capture size 262144 bytes
^C16 packets captured
16 packets received by filter
0 packets dropped by kernel
```

While the capture is running, we run the following commands on a client to trigger the buggy behavior:
```
root@debian:~# netcat -p 4321 -v 10.0.01 1234
Connection to 10.0.01 1234 port [tcp/*] succeeded!
hello
^C
root@debian:~# sleep 60
root@debian:~# netcat -p 4321 -v 10.0.01 1234
Connection to 10.0.01 1234 port [tcp/*] succeeded!
^C
root@debian:~#
```

On the first connection attempt, we receive a reply with payload from the server and then terminate the connection with Ctrl+C. We then wait 60 seconds, which is necessary for the previous connection to leave the TIME_WAIT state. Afterwards, we open another connection that reuses the source port of the first one, and this time we don't receive a reply from the server. The captured traffic shows that after the three-way handshake of the second TCP connection, packets from the router to the client use a different server port than the one used to initiate the connection.
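To spot the port change without paging through the whole dump, something like the following sketch could be used. It reads the text output of `tcpdump -nn` and prints every distinct source port seen on packets sent from a given address. The function name `src_ports_from` is made up for illustration, and the parsing assumes the usual one-line-per-packet IPv4 output format (`<time> IP <src>.<port> > <dst>.<port>: ...`).

```shell
#!/bin/sh
# Sketch: list the distinct source ports of packets sent from one IPv4 address.
# src_ports_from is a hypothetical helper; it expects `tcpdump -nn` text on stdin.
src_ports_from() {
    awk -v ip="$1" '
        $2 == "IP" {
            src = $3
            # The last dot-separated component of the source field is the port.
            if (!match(src, /\.[0-9]+$/)) next
            addr = substr(src, 1, RSTART - 1)
            port = substr(src, RSTART + 1)
            # Print each port only once, in order of first appearance.
            if (addr == ip && !(port in seen)) {
                seen[port] = 1
                print port
            }
        }'
}

# Usage on the router, with the capture file from above:
#   tcpdump -nn -r /tmp/ipvs_port_reuse.pcap tcp | src_ports_from 10.0.0.1
```

If the router's replies are intact, only the service port should be listed for 10.0.0.1. More than one port in the output would point to the mid-stream rewrite described above.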
Regards,
Sven
Attachment:
ipvs_port_reuse.pcap
Description: application/vnd.tcpdump.pcap