RE: Distinguishing NAT(PAT) inbound frames when using IPsec Transport mode from multiple NAT(PAT) systems

"Rajcan, Steven L" <Steven.Rajcan@xxxxxxxxxx> · Fri, 14 Jul 2017 18:06:17 +0000

Noel,

Thanks for the response. We are already using the connmark plugin in
strongSwan. With that plugin we can correctly configure and establish
IPsec SAs that map to individual PAT systems. Data transfer through those
SAs works fine except in the one case described: multiple PAT systems
open a TCP or UDP dialogue to the same port on the public system, each
using the same source port.

I tried the SNAT rule you suggested, modified for our environment:
	iptables -t nat -A INPUT -m policy --pol ipsec --dir in -j SNAT --to-source 20.20.20.10 --random
The source port was rewritten, but the inbound frames from both PAT
endpoints were given the same new source port, so the problem remained.
The second TCP attempt succeeded but replaced the first TCP connection.

The inbound, transformed frames keep the mark set by the connmark plugin,
so we can tell which PAT system each inbound frame came from. I therefore
tried to SNAT just the port on marked inbound frames with the following:
	iptables -t nat -A INPUT -s 20.20.20.10 -m mark ! --mark 0x00 -j SNAT --to-source 20.20.20.10 --random
The source ports were changed, but the new source port was again the same
for frames from both PAT systems.

I then tried to create two SNAT entries, one for each mark that was created
by the connmark plugin:
	iptables -t nat -A INPUT -s 20.20.20.10 -m mark --mark 0x01 -j SNAT --to-source 20.20.20.10 --random
	iptables -t nat -A INPUT -s 20.20.20.10 -m mark --mark 0x02 -j SNAT --to-source 20.20.20.10 --random
Once more the source ports were changed, but the new source port was the
same for frames from both PAT systems.
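
One possible explanation, offered as an assumption and not a confirmed
diagnosis: the iptables man page says the nat table is consulted only for
packets that create a new connection. If conntrack treats the second PAT
system's frames as part of the first system's existing entry (the tuples are
identical), the second rule would never be evaluated. A minimal sketch for
checking this follows. It only prints the commands by default; the run helper
and the APPLY variable are hypothetical names for this sketch, and the
conntrack tool from conntrack-tools is assumed to be installed.

```shell
#!/bin/sh
# Dry run by default: print each command. Set APPLY=1 (as root) to execute.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Hypothetical helper: show what the nat INPUT rules and conntrack have seen.
show_nat_diagnostics() {
    pat_ip=$1
    # Per-rule packet/byte counters: a rule whose counter stays at 0 never matched.
    run iptables -t nat -L INPUT -n -v --line-numbers
    # Connection tracking entries for TCP flows from the PAT router address.
    run conntrack -L -p tcp --src "$pat_ip"
}

show_nat_diagnostics 20.20.20.10
```

If only the 0x01 rule shows a non-zero counter after both PAT systems have
connected, that would suggest the second flow never reached the nat table as
a new connection.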

Finally, I decided to create a TCP-dialogue-specific SNAT entry with an
assigned replacement port for each PAT system. That way I knew the original
source port from each PAT system would be mapped to a unique source port:
	iptables -t nat -A INPUT -s 20.20.20.10/32 -p tcp -m mark --mark 0x1 -m tcp --sport 5000 -j SNAT --to-source 20.20.20.10:5001
	iptables -t nat -A INPUT -s 20.20.20.10/32 -p tcp -m mark --mark 0x2 -m tcp --sport 5000 -j SNAT --to-source 20.20.20.10:5002
When I ran the test, the first TCP connection showed the mapped source port
of 5001. However, the second TCP connection again replaced the first. I
double-checked with netstat and verified that there was only one TCP
dialogue, with the modified source port (5002) mapped from the second PAT
system. This does not make sense, but that is what happened.
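
For more than two PAT systems, the same pair of rules can be generated
instead of typed by hand. The sketch below is only an illustration: the
gen_snat_rules and run helpers, the APPLY variable, and the "original port
plus mark" numbering are assumptions made here to reproduce the 5001/5002
mapping above. It prints the rules by default and does not touch the
firewall.

```shell
#!/bin/sh
# Dry run by default: print each command. Set APPLY=1 (as root) to execute.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Hypothetical helper: one port-remapping SNAT rule per connmark value.
# Usage: gen_snat_rules <pat-router-ip> <original-sport> <mark>...
gen_snat_rules() {
    pat_ip=$1
    sport=$2
    shift 2
    for mark in "$@"; do
        # Assumed numbering scheme: new port = original port + mark.
        new_port=$((sport + mark))
        run iptables -t nat -A INPUT -s "$pat_ip/32" -p tcp \
            -m mark --mark "$mark" -m tcp --sport "$sport" \
            -j SNAT --to-source "$pat_ip:$new_port"
    done
}

gen_snat_rules 20.20.20.10 5000 1 2
```

The marks are passed in decimal here (1 and 2), which iptables accepts as
equivalent to 0x1 and 0x2.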

I think the issue is that the stack does not map the restored reply frame
back to the correct IPsec SA for a particular PAT system. This appears to
happen only in this scenario, where the post-transform inbound frames from
multiple PAT peers are identical.
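
To make the collision concrete, here is a toy model of the flow key for the
two inbound frames. It is a sketch only: flow_key is a hypothetical helper,
not how the kernel stores flows, and the destination 20.20.20.20:80 is taken
from the example in my original message.

```shell
#!/bin/sh
# Flow key as the public system's stack would see it after ESP decapsulation.
# Hypothetical helper for illustration; not the kernel's actual data structure.
flow_key() {
    echo "tcp $1:$2 -> $3:$4"
}

# Both PAT systems appear as the PAT router address with the same source port.
key_pat1=$(flow_key 20.20.20.10 5000 20.20.20.20 80)   # frame carries mark 0x1
key_pat2=$(flow_key 20.20.20.10 5000 20.20.20.20 80)   # frame carries mark 0x2

if [ "$key_pat1" = "$key_pat2" ]; then
    echo "collision: $key_pat1"
fi

# Including the mark in the key is enough to tell the two flows apart.
if [ "$key_pat1 mark=1" != "$key_pat2 mark=2" ]; then
    echo "distinct once the mark is part of the key"
fi
```

Anything that keys only on addresses and ports sees one flow here, which
matches the behaviour where the second connection replaces the first.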

Steve

>I forgot to mention, that you can perform nat in INPUT and randomize the
>source ports with SNAT.
>-t nat -A INPUT -m policy --pol ipsec --dir in -j SNAT --random
>
>Check the man page for iptables-extensions for details.
>*nat INPUT isn't in the colourful graph that Jan Engelhardt made, but it
>exists.
>
>On 12.07.2017 21:41, Noel Kuntze wrote:
>> Hello Steven,
>>
>> Take a look at what the connmark plugin[1] of strongSwan does.
>> I think doing the same fixes your problem. Or switch to strongSwan
>> right away and use the plugin.
>>
>> [1] https://wiki.strongswan.org/projects/strongswan/wiki/Connmark
>>
>> Kind regards
>>
>> Noel
>>
>> On 12.07.2017 21:35, Rajcan, Steven  L wrote:
>>> Hello,
>>>
>>> I have created IPsec policies using transport mode that allow systems
>>> behind a NAT(PAT) router to connect to a public system. The issue I am
>>> having is on a public system with established IPsec tunnels to systems
>>> behind a PAT (Port-Address-Translation) router. These routers multiplex
>>> systems behind a single IPv4 address. The IPsec SAs are created properly
>>> and I am able to send data from these PAT systems to the public system
>>> through the IPsec tunnels in most scenarios. However, it is possible
>>> that frames sent from two or more PAT systems arrive at the public
>>> server stack with the same source port, same source IP, same
>>> destination port, and same destination IP. This occurs because the PAT
>>> router cannot modify the original TCP or UDP payload encapsulated in the
>>> ESP frame. In these scenarios, the stack on the public system gets
>>> confused and cannot map replies to those frames back through the correct
>>> IPsec tunnel of the PAT system.
>>>
>>> Consider two PAT systems attempting a TCP connection to the same public
>>> server but each happens to use the same local port of 45000.
>>>     PAT1 system IP addr:               192.168.0.1
>>>     PAT2 system IP addr:               192.168.0.2
>>>     PAT router public IP addr:         20.20.20.10
>>>     Public system IP addr:             20.20.20.20
>>> Note that the original TCP frame sent by the PATx system is
>>> encapsulated in a UDP/ESP frame and is therefore not modified by the
>>> PAT router.
>>>     PAT1 [192.168.0.1:45000,20.20.20.20:80] --> PAT Router [20.20.20.10] -> public system [20.20.20.20:80]
>>>     PAT2 [192.168.0.2:45000,20.20.20.20:80] --> PAT Router [20.20.20.10] -> public system [20.20.20.20:80]
>>> The original IP of the PAT systems is NAT'ed to that of the PAT router
>>> and the post-transform, inbound frames arriving at the public system
>>> stack are identical for both endpoints:
>>> [20.20.20.10:45000,20.20.20.20:80]
>>>
>>> When testing this scenario, the first PAT system establishes the TCP
>>> connection properly. The second PAT system also connects but only a
>>> single TCP connection is established on the public system. An iptables
>>> log seems to indicate that the second TCP connection replaces the first.
>>>
>>> We have discovered that other platforms handle this scenario
>>> automatically on the public system by modifying the source port on the
>>> inbound, post-transform frame before it is sent up the stack. Thus the
>>> stack sees a unique frame for every TCP and UDP dialogue with the PAT
>>> endpoints. The reply frame then contains the modified source port which
>>> is restored by the OS to the original source port and is directed back
>>> through the original IPsec tunnel.
>>>
>>> So the question is, can the Linux kernel do the same or something
>>> similar? I looked at the xfrm routines and could not find anything that
>>> indicates that it could.
>>>
>>> Please note that using tunnel mode instead of transport mode is not an
>>> option for our situation.
>>>
>>> Any help would be appreciated.
>>> Thanks
>>>
>>> Steve Rajcan
>>> mailto:Steven.Rajcan@xxxxxxxxxx
