Re: A question about multipath routing...

On Tue, Jul 17, 2001 at 09:40:17PM +0200, Something Else wrote:
> I have read one of your letters in the linux-net archive,
> that was posted in 1999 (quite a time ago), and it was about
> using two ISPs at the same time.

Here's an answer, CC'd to linux-net so that the many archives can hold
a copy...

> We have had the same problem in our school for years, and
> I've been looking for a solution, but haven't found one yet.
> Would you be so kind as to give me a hint about how to set up
> a linux machine (ie a Debian potato) to handle two ISPs
> with masquerading?

There may be some new features in 2.4.x kernels, but I haven't started
using 2.4 on any machine on a network perimeter yet, so I know nothing
about them.  ;-)  So this assumes you're using kernel 2.2.

You need these options in a 2.2 kernel:

	CONFIG_NET_SECURITY
	CONFIG_FILTER
	CONFIG_IP_ADVANCED_ROUTER
	CONFIG_NETLINK
	CONFIG_IP_MULTIPLE_TABLES
	CONFIG_IP_ROUTE_MULTIPATH
	CONFIG_IP_FIREWALL
	CONFIG_IP_FIREWALL_NETLINK
	CONFIG_NETLINK_DEV
	CONFIG_IP_MASQUERADE

You also need the 'iproute' Debian package, or equivalent on other
distributions.

I assume you have three network interfaces:

	eth0	- your internal private network
	eth1	- connected to ISP #1
	eth2	- connected to ISP #2

If one of your ISP's uses a non-Ethernet device (e.g. a PPPoE link, an
ISDN card, or some kind of V.35 or non-standard serial connector), then
this setup still works as long as both ISP's are up, but you lose a few
useful features such as automatic dead gateway detection.  It is still
possible to detect failure of one or both ISP's in user-space and
adjust the routes with a simple shell script--this solution is OK if
you have only a handful of feeds.

I also assume you know how to set up a single-feed masquerading gateway,
and therefore I'll limit this to a discussion of the difference between
a single-feed gateway and a multi-feed gateway.

Before I get into the guts of how it works, here's a high-level overview.
There are three ways you can use two Internet feeds:

	1.  Bit-wise load-balancing:  This is used to e.g. stick multiple
	ISDN channels together to form a really fast link out of several
	slow ones.  This is usually useful only if the two points at the
	ends of all the links are the same--i.e. it can't be used with
	two different ISP's.  Also known as "bonding" or "multi-link".

	2.  Packet-wise load balancing:  This is useful if you can
	get both ISP's to co-operate to explicitly connect the same
	IP address to the Internet via their own independent routes.
	In this case you use something like sch_teql (the TEQL scheduler)
	to create a virtual device that distributes your packets across
	many network interfaces.

	Generally you can't use this if you have two different ISP's
	with a "consumer" service level; however, you _can_ use this if
	you tunnel all the packets via IPIP or CIPE.

	I use this to generate load-balanced links between my laptop and
	home machine, by distributing the packets across two different
	CIPE tunnels, each of which uses one of the IP's of my home
	machine.

	3.  Equal-Cost Multi-Path:  A more accurate name would be
	"destination address-based load balancing".  Normally, when Linux
	wants to find a route to an IP address, it looks in a cache of
	routes for performance.  If a destination IP is not in the cache,
	Linux normally consults the routing table, determines the 
	appropriate route for the IP address, and places this route (i.e.
	a destination device and hardware address tuple) in the cache.
	ECMP simply adds one wrinkle to this procedure, by allowing the 
	resulting cached tuple to be chosen non-deterministically from
	one of several equal (or optionally weighted) options.

	In the two-ISP case, what happens with ECMP is that every external
	IP address you connect to uses either ISP#1 or ISP#2.

	There is another feature with the ECMP implementation in Linux:
	if your upstream network interfaces use ARP (e.g. they are
	running the conventional IP-over-Ethernet), then if one of the
	interfaces should die, ECMP will automatically drop the dead
	interface.

If you have two "consumer" Internet services from two different ISP's,
and you want to talk to random sites on the Internet, then Equal-Cost
Multi-Path is almost the best you can do (the best you can do is have
each TCP connection choose one of the routes non-deterministically, 
as opposed to the current 2.2 ECMP implementation which can only generate
distinct routes for unique IP addresses).
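
As a rough illustration of the "optionally weighted" case mentioned under
option 3, here is a minimal sketch using the 'weight' keyword of
'ip route'.  The gateway addresses are the same placeholders used in the
script later in this message, and the 2:1 ratio, the run() helper and the
DRY_RUN variable are assumptions made for the example, not part of the
original setup.  By default it only prints the command it would run.

```shell
#!/bin/sh
# Sketch only: a weighted variant of the two-gateway default route.
# GW1/GW2 are placeholder gateways; the 2:1 ratio is an assumed example.
GW1=${GW1:-1.2.3.1}
GW2=${GW2:-2.3.4.1}

# Hypothetical helper: print the command by default; set DRY_RUN=0
# (as root) to actually execute it.
run() {
	if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi
}

# $1 = weight for GW1, $2 = weight for GW2
weighted_default() {
	run ip route replace default \
		nexthop via "$GW1" weight "$1" \
		nexthop via "$GW2" weight "$2"
}

# Prefer ISP#1 for roughly two out of every three new destinations.
weighted_default 2 1
```

Since routes are cached per destination, the weights should only influence
how new destinations are spread across the feeds, not individual packets.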

There are two basic problems to solve to make IP masq work in this
setup, and each of these problems has two parts:

	1a.  How to get your packets to the outside world,

	1b.  How the outside world replies to you,

	2a.  How the outside world sends packets to you,

	2b.  How you reply to the outside world.

"Normal" IP routing (using a single routing table keyed by the destination
IP address) solves problems 1b and 2a, and ECMP solves problem 1a.
So far, problem 2b is not solved at all: a reply to a connection that
arrived via ISP#2 could just as easily be sent out via ISP#1.  Linux
Policy Routing allows you to select between multiple IP routing tables
based on the _source_ address of the packet.  This solves problem 2b.

So...in the end, it all looks like this:

	# First, set up your various Internet interfaces.  Do NOT set up 
	# any default gateway; that will be done below.  The following
	# assumes all of your network interfaces are already up
	# and running.

	# Create three routing tables, in addition to the default,
	# which route packets depending on the source IP addresses:

	# table 10 is for the private network behind the gateway
	# IP 10.x.x.x, all on one LAN.  We put this first to get
	# it out of the way.
	ip rule add pref 10 to 10.0.0.0/8 table 10
	ip route add 10.0.0.0/8 table 10 dev eth0

	# table 20 is for ISP #1, IP 1.2.3.4, gateway 1.2.3.1
	ip rule add pref 20 from 1.2.3.4 table 20
	ip route add default table 20 via 1.2.3.1

	# table 30 is for ISP #2, IP 2.3.4.5, gateway 2.3.4.1
	ip rule add pref 30 from 2.3.4.5 table 30
	ip route add default table 30 via 2.3.4.1

	# The main routing table (called the "default" table in this message;
	# it is the one 'ip route add' uses when no table is given) is
	# consulted if none of the rules above match.  If your ISP's have
	# servers that authenticate by originating IP address (e.g. SMTP or
	# NNTP servers), you will want to explicitly list them here.
	ip route add 1.2.3.0/24 dev eth1
	ip route add 2.3.4.0/24 dev eth2

	# The default route in the default routing table 
	# uses ECMP to choose upstream routers
	ip route add default nexthop via 1.2.3.1 nexthop via 2.3.4.1

	# If you have PPPoE on one feed, you'll need to do something like
	# this instead:
	# ip route add default nexthop via 1.2.3.1 nexthop dev ppp0

	# Make it all happen.  IMPORTANT!  The above commands do NOT
	# flush the route cache!
	ip route flush cache
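
To check that the rules and tables ended up the way the commands above
intend, something like the following read-only sketch can help.  The
function name and the IP_CMD override are made up for this example; the
table numbers are the ones used above.

```shell
#!/bin/sh
# Sketch: read-only dump of the policy-routing state built above.
# IP_CMD is a hypothetical override so the function can be pointed at
# a stub; table numbers 10/20/30 are the ones used in this message.
IP_CMD=${IP_CMD:-ip}

show_policy_routing() {
	echo "== rules =="
	$IP_CMD rule list
	for table in 10 20 30 main; do
		echo "== table $table =="
		$IP_CMD route list table "$table"
	done
}

# Only run when asked, so the file can also be sourced by other scripts.
if [ "${1:-}" = "show" ]; then
	show_policy_routing
fi
```

In the rule listing you would expect to see the three rules at
preferences 10, 20 and 30 ahead of the built-in lookup of the main table.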

IP masquerading is done normally; however, it fails in one important way:
UDP masqueraded "sessions" can change outgoing network interface but
don't change the masqueraded source IP address.  If your ISP filters
out packets with off-subnet source IP's, it means your UDP masqueraded
packets will be filtered out inside the ISP.  You can work around this by
keeping the UDP masquerade timeout as short as possible--unfortunately,
if there is a way to flush the IP masquerading table, I don't know what
it is.  TCP has the same problem, but you can work around it by simply
closing the dead TCP connection and opening a new one.  UDP protocols,
on the other hand, generally use the same port on the client side over
and over again...
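
For the "keep the UDP masquerade timeout short" workaround, a minimal
sketch for a 2.2 kernel with ipchains might look like this.  The
30-second value, the run() helper and the DRY_RUN variable are assumptions
for the example; pick a timeout that suits your own traffic.  By default
the script only prints the command.

```shell
#!/bin/sh
# Sketch for 2.2 kernels with ipchains; the 30 s default is an assumed value.
UDP_TIMEOUT=${UDP_TIMEOUT:-30}

# Hypothetical helper: print the command by default; set DRY_RUN=0
# (as root) to actually execute it.
run() {
	if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi
}

# 'ipchains -M -S' takes three timeouts in seconds: TCP sessions,
# TCP sessions after a FIN, and UDP.  A value of 0 leaves the
# corresponding timeout unchanged.
set_udp_masq_timeout() {
	run ipchains -M -S 0 0 "$1"
}

set_udp_masq_timeout "$UDP_TIMEOUT"
```

A shorter timeout means a stale UDP association is forgotten sooner, at
the cost of breaking slow but otherwise healthy UDP exchanges.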

So what happens if we try to send a packet now?  

	1.  If the packet has a destination address on the private
	masqueraded network (above, 10.0.0.0/8), then we send it to the
	private network interface (eth0).

	2.  If the packet has a known source IP address (i.e. from a
	socket that has been bound with bind(), which is true of most
	server sockets), then it is sent via eth1 or eth2 depending on
	its source IP address.

	3.  If the packet does not have a known source IP address
	(i.e. from a socket that has not been bound with bind(), or
	from a new masqueraded connection to the outside world), then
	it falls through all of the routes to the ECMP route at the end
	of the default routing table.  This sends the packet to either
	1.2.3.1 or 2.3.4.1, if there exist ARP table entries for them.

Note that most ISP's restrict access to certain servers they provide, 
especially NNTP servers for Usenet news.  This means that you will have
to add static routes to all such servers in your normal routing table,
to force your machine to contact these servers on directly attached
interfaces.
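
A sketch of what such static routes could look like follows.  The server
addresses are hypothetical placeholders (taken from the documentation
address ranges), as are the pin_servers and run helper names; substitute
the real addresses of your ISP's news and mail servers.  By default the
script only prints the commands.

```shell
#!/bin/sh
# Hypothetical server addresses; replace with your ISP's real ones.
ISP1_SERVERS=${ISP1_SERVERS:-"192.0.2.10 192.0.2.11"}
ISP2_SERVERS=${ISP2_SERVERS:-"198.51.100.25"}

# Hypothetical helper: print the command by default; set DRY_RUN=0
# (as root) to actually execute it.
run() {
	if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi
}

# $1 = gateway, $2 = device, remaining arguments = server addresses.
# Adds one host route per server to the main routing table.
pin_servers() {
	gw=$1
	dev=$2
	shift 2
	for server in "$@"; do
		run ip route replace "$server/32" via "$gw" dev "$dev"
	done
}

# The lists are deliberately unquoted so each address becomes one argument.
pin_servers 1.2.3.1 eth1 $ISP1_SERVERS
pin_servers 2.3.4.1 eth2 $ISP2_SERVERS
```

Because host routes are more specific than the ECMP default route, they
should win for unbound sockets and new masqueraded connections.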

To achieve load-balancing or failure-avoidance for incoming packets,
you can use any mechanism that selects an IP address for applications
to use.  Round-robin DNS is good enough for most purposes, assuming
that your ISP's are roughly equal in terms of reliability.  You can
also do things like list one IP as preferable to another in MX records,
so that people will try to use ISP#2 before ISP#1 or vice versa.
As mentioned above, you can use IPIP or CIPE to tunnel packets over
a VPN through both ISP's using packet-by-packet load balancing.

If you want real failover, the best way to achieve it is to try pinging
some machine at the ISP--one that has a static route through one
of your ISP interfaces.  You can then use the 'ip route replace' command
to change the default route, like this:

	ip route replace default via 1.2.3.1	# if ISP#2 is down
	ip route replace default via 2.3.4.1	# if ISP#1 is down
	ip route replace default nexthop via ... # if they're both up

Note that due to the IP masquerading bug (mentioned above), this will
effectively destroy all of your existing masqueraded UDP associations.
TCP connections will also be destroyed, but this is unavoidable, and
not due to any masquerade implementation bug.
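
The ping-and-replace approach above could be scripted roughly as follows.
This is only a sketch: the probe addresses are hypothetical placeholders
(each one is assumed to have a static route through its own ISP
interface), and the function names, the "check" argument and the DRY_RUN
variable are invented for the example.  By default it prints the commands
instead of running them.

```shell
#!/bin/sh
# Sketch of a user-space failover check; run it periodically (e.g. from cron).
GW1=${GW1:-1.2.3.1}
GW2=${GW2:-2.3.4.1}
# Hypothetical probe hosts, each reachable only through its own ISP.
PROBE1=${PROBE1:-192.0.2.10}
PROBE2=${PROBE2:-198.51.100.25}

# Hypothetical helper: print the command by default; set DRY_RUN=0
# (as root) to actually execute it.
run() {
	if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi
}

# Treat a host as up if ping reports success for a run of three probes.
is_up() {
	ping -c 3 "$1" >/dev/null 2>&1
}

state() {
	if is_up "$1"; then echo up; else echo down; fi
}

# $1/$2 are "up" or "down" for ISP#1/ISP#2.
choose_default() {
	case "$1,$2" in
		up,up)   run ip route replace default nexthop via "$GW1" nexthop via "$GW2" ;;
		up,down) run ip route replace default via "$GW1" ;;
		down,up) run ip route replace default via "$GW2" ;;
		*)       echo "both ISPs appear down; leaving routes alone" >&2
		         return 0 ;;
	esac
	run ip route flush cache
}

failover_once() {
	choose_default "$(state "$PROBE1")" "$(state "$PROBE2")"
}

if [ "${1:-}" = "check" ]; then
	failover_once
fi
```

Replacing the route on every run is wasteful but harmless for a sketch; a
real script would probably remember the previous state and act only on
changes, since each change disturbs existing masqueraded sessions.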

-- 
Zygo Blaxell (Laptop) <zblaxell@feedme.hungrycats.org>
GPG = D13D 6651 F446 9787 600B AD1E CCF3 6F93 2823 44AD

