Noel,

Thanks for the response.

We are already using the connmark plugin in strongSwan. With that plugin we
can correctly configure and establish IPsec SAs that map to individual PAT
systems. Data transfer through those SAs works fine except in the one case
described, where multiple PAT systems attempt a TCP or UDP dialogue to a
common port of a public system with a common source port.

I tried the SNAT you suggested, modified for our environment:

  iptables -t nat -A INPUT -m policy --pol ipsec --dir in -j SNAT --to-source 20.20.20.10 --random

The source port was modified, but the same new source port was used for the
inbound frames from both PAT endpoints, so the problem remained. The second
TCP attempt succeeded but replaced the first TCP connection.

The inbound, transformed frames keep the mark set by the connmark plugin,
which means we can tell which PAT system each inbound frame came from. So I
tried to SNAT just the port on marked inbound frames with the following:

  iptables -t nat -A INPUT -s 20.20.20.10 -m mark ! --mark 0x00 -j SNAT --to-source 20.20.20.10 --random

The source ports were changed, but the new source port was again the same
for frames from both PAT systems.

I then tried to create two SNAT entries, one for each mark created by the
connmark plugin:

  iptables -t nat -A INPUT -s 20.20.20.10 -m mark --mark 0x01 -j SNAT --to-source 20.20.20.10 --random
  iptables -t nat -A INPUT -s 20.20.20.10 -m mark --mark 0x02 -j SNAT --to-source 20.20.20.10 --random

Again, the source ports were changed, but the new source port was the same
for frames from both PAT systems.

Finally, I decided to create a TCP-dialogue-specific SNAT entry with an
assigned modified port for each PAT system.
That way I knew the original source ports from each PAT system would be
mapped to unique source ports:

  iptables -t nat -A INPUT -s 20.20.20.10/32 -p tcp -m mark --mark 0x1 -m tcp --sport 5000 -j SNAT --to-source 20.20.20.10:5001
  iptables -t nat -A INPUT -s 20.20.20.10/32 -p tcp -m mark --mark 0x2 -m tcp --sport 5000 -j SNAT --to-source 20.20.20.10:5002

When I ran the test, the first TCP connection showed the mapped source port
of 5001. However, the second TCP connection again replaced the first. I
double-checked with netstat and verified that there was only one TCP
dialogue, with the modified source port (5002) mapped from the second PAT
system. This does not make sense, but that is what happened.

I think the issue is that the stack does not map the restored reply frame
back to the correct IPsec SA for a particular PAT system, but only in this
scenario, when the original post-transform inbound frames from multiple PAT
peers are identical.

Steve

>I forgot to mention, that you can perform nat in INPUT and randomize the source ports with SNAT.
>-t nat -A INPUT -m policy --pol ipsec --dir in -j SNAT --random
>
>Check the man page for iptables-extensions for details.
>*nat INPUT isn't in the colourful graph that Jan Engelhardt made, but it exists.
>
>On 12.07.2017 21:41, Noel Kuntze wrote:
>> Hello Steven,
>>
>> Take a look at what the connmark plugin[1] of strongSwan does.
>> I think doing the same fixes your problem. Or switch to strongSwan
>> right away and use the plugin.
>>
>> [1] https://wiki.strongswan.org/projects/strongswan/wiki/Connmark
>>
>> Kind regards
>>
>> Noel
>>
>> On 12.07.2017 21:35, Rajcan, Steven L wrote:
>>> Hello,
>>>
>>> I have created IPsec policies using transport mode that allow systems behind
>>> a NAT(PAT) router to connect to a public system. The issue I am having is on
>>> a public system with established IPsec tunnels to systems behind a PAT
>>> (Port-Address-Translation) router.
>>> These routers multiplex systems behind a
>>> single IPv4 address. The IPsec SAs are created properly and I am able to
>>> send data from these PAT systems to the public system through the IPsec
>>> tunnels in most scenarios. However, it is possible that frames sent from two
>>> or more PAT systems arrive at the public server stack with the same source
>>> port, same source IP, same destination port, and same destination IP. This
>>> occurs because the PAT router cannot modify the original TCP or UDP payload
>>> encapsulated in the ESP frame. In these scenarios, the stack on the public
>>> system gets confused and cannot map replies to those frames back through the
>>> correct IPsec tunnel of the PAT system.
>>>
>>> Consider two PAT systems attempting a TCP connection to the same public
>>> server but each happens to use the same local port of 45000.
>>> PAT1 system IP addr: 192.168.0.1
>>> PAT2 system IP addr: 192.168.0.2
>>> PAT router public IP addr: 20.20.20.10
>>> Public system IP addr: 20.20.20.20
>>> Note that the original TCP frame sent by the PATx system is encapsulated in
>>> a UDP/ESP frame and is therefore not modified by the PAT router.
>>> PAT1 [192.168.0.1:45000,20.20.20.20:80] --> PAT Router [20.20.20.10] --> public system [20.20.20.20:80]
>>> PAT2 [192.168.0.2:45000,20.20.20.20:80] --> PAT Router [20.20.20.10] --> public system [20.20.20.20:80]
>>> The original IP of the PAT systems is NAT'ed to that of the PAT router and
>>> the post-transform, inbound frames arriving at the public system stack are
>>> identical for both endpoints. [20.20.20.10:45000,20.20.20.20:80]
>>>
>>> When testing this scenario, the first PAT system establishes the TCP
>>> connection properly. The second PAT system also connects but only a single
>>> TCP connection is established on the public system. An iptables log seems to
>>> indicate that the second TCP connection replaces the first.
>>>
>>> We have discovered that other platforms handle this scenario automatically
>>> on the public system by modifying the source port on the inbound,
>>> post-transform frame before it is sent up the stack. Thus the stack sees a
>>> unique frame for every TCP and UDP dialogue with the PAT endpoints. The
>>> reply frame then contains the modified source port which is restored by the
>>> OS to the original source port and is directed back through the original
>>> IPsec tunnel.
>>>
>>> So the question is, can the Linux kernel do the same or something similar?
>>> I looked at the xfrm routines and could not find anything that indicates
>>> that it could.
>>>
>>> Please note that using tunnel mode instead of transport mode is not an
>>> option for our situation.
>>>
>>> Any help would be appreciated.
>>> Thanks
>>>
>>> Steve Rajcan
>>> mailto:Steven.Rajcan@xxxxxxxxxx
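
For anyone who wants to repeat the last experiment in the reply at the top of this message with more than two PAT peers, the per-mark SNAT rules can be generated rather than typed by hand. The following is only a minimal sketch under assumptions: the function name gen_snat_rules and the variables ROUTER_IP, ORIG_SPORT and BASE_PORT are made up for illustration, the values are the ones from this thread's test setup, and marks are assumed to run from 1 to N. The script prints the rules and does not apply them. It reproduces the same kind of rule already tried, so it is not expected to cure the reply-path problem described in this thread.

```shell
#!/bin/sh
# Print (do not apply) one SNAT rule per connmark value, each with its own
# replacement source port. All names and values below are illustrative
# assumptions taken from the test setup discussed in this thread.
ROUTER_IP="20.20.20.10"   # public address of the PAT router
ORIG_SPORT=5000           # source port shared by the PAT systems in the test
BASE_PORT=5000            # mark N is rewritten to BASE_PORT + N

gen_snat_rules() {
    # $1 = number of PAT peers; marks are assumed to be 1..N
    count="$1"
    mark=1
    while [ "$mark" -le "$count" ]; do
        new_port=$((BASE_PORT + mark))
        printf 'iptables -t nat -A INPUT -s %s/32 -p tcp -m mark --mark 0x%x -m tcp --sport %d -j SNAT --to-source %s:%d\n' \
            "$ROUTER_IP" "$mark" "$ORIG_SPORT" "$ROUTER_IP" "$new_port"
        mark=$((mark + 1))
    done
}

# Example: print the rules for two PAT peers (marks 0x1 and 0x2).
gen_snat_rules 2
```

Reviewing the printed rules before running them as root keeps the experiment easy to undo, since each -A line has a matching -D form.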