[LARTC] redundancy and multipath routing.

smohan@xxxxxxxx (S Mohan) · Tue, 12 Aug 2003 08:59:58 +0530

I use a LEAF Bering distribution, which is based on the 2.4.18 kernel. I wanted to
experiment with it for link load balancing and redundancy and ran into some
hitches. Pointers would be welcome and helpful.

I set up a single machine with 2 ethernet interfaces as per the network
schematic below.

 +----------+ A.B.64.175/26
 |          |-------- eth0--------------- gw A.B.64.134
 | LEAF Box |
 |          |-------- eth1--------------- gw A.B.65.129
 +----------+ A.B.65.131/29

I have a third ethernet port that I can configure as the 192.168.1.1 local LAN
interface. A.B.64.134 is a land link (approx 400ms latency) while A.B.65.129
is a satellite link (approx 750ms latency). I measured the latency by
switching the default route between the two gateways and pinging the same
target IP.

[Case 1]
I first wanted to check whether failover from one interface to the other works
when route priority is set with the metric on each default route.

The commands I gave and the outputs are as under:

# ip ad flush dev eth0
# ip ad flush dev eth1
# ip ad flush dev eth2
# ip ad add dev eth0 A.B.64.175/26
# ip ad add dev eth1 A.B.65.131/29
# ip ro add default via A.B.64.134 metric 1
# ip ro add default via A.B.65.129 metric 2
# ping W.X.Y.Z

[Result 1]
Ping gets replies when both links are connected. If I disconnect
the ethernet cable from eth0, the ping still goes through. If I reconnect
the cable on eth0 and disconnect eth1, ping stops. If I reconnect eth1,
ping resumes with the ICMP sequence number much higher than when it
stopped, and the packets in between are reported as lost.

I thought by looking at ping latency, I could make out which link is being
used. Latency was always 750ms.
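
Telling the links apart by latency can be automated with a small filter over the ping output. This is only a sketch: the function name classify_rtt and the 600 ms cut-off are assumptions, chosen because the cut-off falls between the two latencies quoted above.

```shell
# Label each ping reply as "land" or "satellite" from its round-trip time.
# The 600 ms default cut-off is an assumption for illustration only.
classify_rtt() {
    awk -v cut="${1:-600}" '
        match($0, /time=[0-9.]+/) {
            # RSTART/RLENGTH delimit "time=<number>"; skip the 5-char "time=" prefix
            rtt = substr($0, RSTART + 5, RLENGTH - 5) + 0
            print rtt, (rtt < cut + 0 ? "land" : "satellite")
        }'
}

# Usage (W.X.Y.Z is the ping target from the post):
#   ping -c 10 W.X.Y.Z | classify_rtt
```

Lines without a time= field, such as the ping header and summary, are skipped.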

My surmise:
The source IP for the ping is taken as A.B.65.131. Replies therefore do not
arrive when eth1 is disconnected, even though the requests go out of eth0.
When eth1 is connected, it is used as the preferred route because the source
IP is from its subnet.
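
A surmise like this can be checked by asking the kernel which device and source address it would use for the target, via `ip route get`. The helper below is a minimal sketch; the name route_field is hypothetical, and it assumes the usual one-line output of the form "W.X.Y.Z via GATEWAY dev ethN src ADDRESS".

```shell
# Print the value that follows a keyword (dev, src, via) in
# `ip route get` output read from stdin.
route_field() {
    awk -v key="$1" '{
        for (i = 1; i < NF; i++)
            if ($i == key) { print $(i + 1); exit }
    }'
}

# Usage on the router (W.X.Y.Z is the ping target from the post):
#   ip route get W.X.Y.Z | route_field dev   # outgoing interface
#   ip route get W.X.Y.Z | route_field src   # source address chosen
```

If the installed ping supports it, `ping -I A.B.64.175 W.X.Y.Z` pins the source address to eth0's, which separates the choice of source address from the choice of route.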

[Question 1]
Am I wrong? Is my interpretation of metrics wrong?

[Case 2]
I removed the default route and added a multipath route using commands as
under:

# ip ro del default
# ip ro del default
# ip ro add default nexthop via A.B.64.134 dev eth0 weight 1 \
                    nexthop via A.B.65.129 dev eth1 weight 1

[Result 2]
Pinging here gave the same results as in [Result 1]. I expected each
ping packet to show a different latency, switching between 450 and 750ms.
That did not happen; latency was consistently 750ms.

[Case 3]
The weights above apply to flows and not packets. Maybe a single ping is
treated as one flow. I changed the multipath route to include equalize using
commands as under:

# ip ro del default
# ip ro add default equalize nexthop via A.B.64.134 dev eth0 weight 1 \
                             nexthop via A.B.65.129 dev eth1 weight 1

[Result 3]
Same as [Result 1] and [Result 2]. At least here I should have got latencies
switching between 450 and 750ms for alternating ICMP requests.
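
On 2.4 kernels the route chosen for a destination is cached, so repeated pings to one target tend to stay on the same next hop. Looking up several different destinations may therefore be a better test of a multipath route than pinging a single one. The sketch below tallies which device each lookup selects; tally_devs is a hypothetical name, and T1, T2, T3 stand in for real target addresses.

```shell
# Count how many route lookups selected each outgoing device.
# stdin: concatenated `ip route get` output, one lookup per destination.
tally_devs() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "dev") { n[$(i + 1)]++; break }
    }
    END { for (d in n) print d, n[d] }' | sort
}

# Usage on the router (T1 T2 T3 are placeholder targets):
#   for t in T1 T2 T3; do ip route get "$t"; done | tally_devs
#   ip route flush cache   # drop cached routes before repeating the test
```

If every destination lands on one device even after the cache is flushed, the multipath route is probably not being used at all.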

[Questions]
1. Is this method of testing correct?
2. Are there any other utilities/ methodologies that I can use to test this
better?
3. Is expecting load balancing/ redundancy to happen for a single flow
wrong?

TIA
Mohan

Linux Advanced Routing and Traffic Control