I am trying to use IPIP tunnels and multipath access to the Internet inside a LAN. Before actually deploying it I am testing in a virtual network environment, Netkit. I am experiencing a strange behaviour whose cause/solution I cannot figure out; please help.

The testbed simulates these 4 segments (if you use a monospaced font the diagrams should render correctly):

I   -------------------
       |      |      |
     eth1   eth0   eth1
       |      |      |
      AP1    NI    AP2

A   -------------------
       |      |      |
     eth0   eth0   eth0
       |      |      |
      AP1    N1     N2

B   ------------
       |      |
     eth1   eth1
       |      |
      N2     N3

C   --------------------------
       |      |      |      |
     eth0   eth0   eth0   eth0
       |      |      |      |
      N3     N4     N5    AP2

Segment I represents the Internet. NI is a destination host in the Internet realm. AP1 and AP2 represent two AP/routers; they do NAT for their clients. The LAN comprises the other segments, A, B and C. In segment A, AP1 is a DHCP server and N1 is its only client. In segment C, AP2 is a DHCP server and N4 and N5 are two clients. N2 and N3 have two NICs; with eth1 they are attached to segment B; they do routing for the LAN.

AP1 and AP2 (and obviously NI) have a public address on eth1. Both also have the address 192.168.1.1 on eth0 and a route to 192.168.1.0/24. N1, N2, N3, N4 and N5 have an address in subnet 10.1.1.0/24. N1 additionally has an address in 192.168.1.0/24 and a default route via AP1; it also brings up an IPIP tunnel interface. N4 is configured the same way. N3 does not have an address in 192.168.1.0/24; it has one IPIP tunnel to N1 and one to N4, and its default route is a multipath route via N1 or via N4. N5, instead, does have an address in 192.168.1.0/24; it also has one IPIP tunnel to N1 and one to N4, and its default route is a multipath route via 192.168.1.1 (AP2), via N1, or via N4.

A ping from N3 to NI works flawlessly. By watching the TTL I can see that the packets pass sometimes via N1, sometimes via N4. A ping from N5 to NI also works flawlessly. By watching the TTL I can see that the packets pass sometimes via N1, sometimes via N4, and sometimes directly via AP2.

The problem is with TCP connections. I wrote a simple client/server application to test it: the server listens, the client connects, and then the two exchange a HELO packet every second. The server runs on NI. The client on N3 works flawlessly; I can launch tens of them simultaneously and let them run for tens of minutes with no problem. The client on N5 works fine now and then, but sometimes it cannot establish the connection at all.

Let me add that the problem is not a TCP connection being reset by the server when it sees a different source IP: the appropriate commands (iptables with CONNMARK, ip rule, etc.) are in place (a simplified sketch of what I mean is further below). Otherwise N3 would experience problems too.

The only difference between N3 (OK) and N5 (not OK) is the default route.

N3:

    ip route add default \
        nexthop via 10.1.1.1 dev ntk-to-inet-0 weight 100 onlink \
        nexthop via 10.1.1.4 dev ntk-to-inet-1 weight 100 onlink

N5:

    ip route add default \
        nexthop via 192.168.1.1 dev eth0 weight 100 \
        nexthop via 10.1.1.1 dev ntk-to-inet-0 weight 70 onlink \
        nexthop via 10.1.1.4 dev ntk-to-inet-1 weight 30 onlink

Perhaps the difference is that N5, when a connection goes via 10.1.1.1 or via 10.1.1.4, has to use its address 10.1.1.5 as source, whereas when a connection goes via 192.168.1.1 it has to use its address 192.168.1.40. N3, on the contrary, always uses its address 10.1.1.3. However, in the "nexthop" part of a multipath route you cannot specify a "src" option. Anyway, the correct "src" option is already specified in the routes used to reach 10.1.1.1 and 192.168.1.1 (see the sketch below), so I think the kernel should know what to do.
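To make that last point concrete: by routes with a "src" option I mean host/prefix routes on N5 roughly like the following (simplified; the exact prefixes in my testbed may differ slightly):

    ip route add 10.1.1.1 dev ntk-to-inet-0 src 10.1.1.5
    ip route add 10.1.1.4 dev ntk-to-inet-1 src 10.1.1.5
    ip route add 192.168.1.0/24 dev eth0 src 192.168.1.40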
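The CONNMARK / ip rule setup I mentioned above is, in simplified form, along these lines (the mark values and table numbers are only examples, not my exact configuration):

    # keep every packet of a connection on the gateway its first packet used
    iptables -t mangle -A OUTPUT -j CONNMARK --restore-mark
    iptables -t mangle -A OUTPUT -m mark --mark 0 -o ntk-to-inet-0 -j MARK --set-mark 1
    iptables -t mangle -A OUTPUT -m mark --mark 0 -o ntk-to-inet-1 -j MARK --set-mark 2
    iptables -t mangle -A OUTPUT -j CONNMARK --save-mark

    # route marked packets through the gateway they started with
    ip rule add fwmark 1 table 101
    ip rule add fwmark 2 table 102
    ip route add default via 10.1.1.1 dev ntk-to-inet-0 onlink table 101
    ip route add default via 10.1.1.4 dev ntk-to-inet-1 onlink table 102

    # (on N5 there is an analogous mark/rule/table for the eth0 path via 192.168.1.1)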
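For reference, the two tunnel interfaces on N5 are created more or less like this (the outer endpoint addresses are replaced by <...> placeholders here; this is only to show where the ntk-to-inet-* names used above come from):

    ip tunnel add ntk-to-inet-0 mode ipip local <N5 endpoint> remote <N1 endpoint>
    ip link set ntk-to-inet-0 up
    ip tunnel add ntk-to-inet-1 mode ipip local <N5 endpoint> remote <N4 endpoint>
    ip link set ntk-to-inet-1 up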
Can anyone guess what the real problem / solution could be? I can send the testbed; it should be easy to reproduce if you already have a working Netkit environment installed.

Regards,
Luca