Re: [LARTC] Solved: Using more than 1 Internet Line

Christoph Simon <ciccio@xxxxxxxxxxxxxxx> · Tue, 4 Dec 2001 09:05:41 -0200

On Tue, 4 Dec 2001 09:52:49 +0100 (MET)
Arthur van Leeuwen <arthurvl@xxxxxxxxxx> wrote:

> Okay, I've read both the nanohowto and the docs on Julian's patches by now.
> A few things to note: the nanohowto's information is good even without
> Julian's patches, although things will become trickier. One has to do one's
> own link-probing and rerouting from userland. That is very doable however,
> provided you have machines somewhere at the ISP's site that will answer to
> either pings or traceroutes or somesuch, as you will need answers.

I'm not sure which situation Julian wanted to address with the
patches, but my reason for using them was to have more than one line,
use all of them at the same time, and be able to continue in case of
failures. I have no relationship with my ISPs, and they are not going
to cooperate in any sense, often not even for lots of money. Often
they do things just to make it more difficult, because they want our
money but do not want us to actually _use_ their installations (this
is a special greeting to Spanish Telefónica and Spanish Terra).

If I had access to ISP cooperation, I could use perfect line
balancing between two known machines, or much easier: just ask for one
bigger line. But the idea is precisely NOT to use the same ISP,
because they are unreliable. Again, let me name as an example the
Spanish company Terra operating in Brazil, which has huge amounts of
money and demonstrates record-beating levels of incompetence and bad
faith.

> The patches Julian provided fix a bunch of nastiness. For one, dead gateway
> detection is done on the ARP level in kernelspace. Very neat when you have
> ARP, thus on ethernet, but not very useful without. Furthermore, they
> provide true alternative routes, not only multipath default routes. This is
> once more extremely neat, but not directly necessary for the usual case.

This is not strictly true, as far as I understood Julian's
comments. The requirement is not ARP as such but some support from the
link level. ARP provides that, and other protocols could provide it,
or be made to provide it, as well. So if it doesn't work with non-ARP
devices, the bug is not in Julian's patches but in the implementation
of those protocols.

> Thirdly, Julian's patches add gateways as a routing key. This will not help
> pure routing boxes, such as would be standard issue in an office full of
> Windows toasters, as the gateway will be determined at the routing stage, so
> it cannot be used as a key.

I did think of that, and one, at least indirect, consequence is that
the gateways actually must be different. I had one case where both
lines came from the same ISP (for regional reasons of availability)
and I got two compatible IPs with the same gateway:

	IFE1 eth0	IPE1 200.201.202.28	GWE1 200.201.202.1
	IFE2 eth1	IPE2 200.201.202.29	GWE1 200.201.202.1

(only the fourth octets are the actual ones)

At that time I didn't know of Julian's patches and wrote a daemon
which would reconfigure the network on failure or recovery of the
lines. To do that, I kept one explicit host route to each gateway,
sent pings to both, and set the default route to the preferred line if
both were working, or to the working line if one had failed. With a
patched or unpatched kernel, what would you do? To make it more
difficult, this was still before 2.4 kernels, using route and ifconfig
only, no ip(8).
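
The decision rule described above (prefer one line, fall back to the
other) can be sketched as follows. This is only an illustration, not
the original daemon: the names ping_ok and choose_gateway are made up,
the gateways are assumed to have distinct addresses, and the ping
options assume the Linux iputils ping. Actually changing the default
route is left out because it depends on the setup.

```python
import subprocess
from typing import Callable, List, Optional


def ping_ok(host: str) -> bool:
    """Return True if a single ICMP echo to `host` is answered.

    Assumes the Linux iputils ping: -c 1 sends one packet,
    -W 2 waits at most two seconds for the reply.
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def choose_gateway(
    gateways: List[str],
    is_up: Callable[[str], bool] = ping_ok,
) -> Optional[str]:
    """Return the first gateway, in order of preference, that answers.

    Returns None when no line is working.
    """
    for gateway in gateways:
        if is_up(gateway):
            return gateway
    return None
```

A caller would pass the gateways in order of preference, for example
choose_gateway(["192.0.2.1", "198.51.100.1"]) with hypothetical
addresses, and point the default route at the result whenever it
changes.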

A priori, it is not obvious why this doesn't work, but it just can't
work. I gave back one line and ordered another one, because once
again, the ISP didn't want to cooperate, preferring to charge for
uninstallation and installation again (some 300 US$ at the time). Not
even arping broadcasts will work. Maybe a hack with source-routed
packets would, but I usually disable this possibility in the kernel.

Yes, the dependency on the gateway is a limitation, but it's _much_
better than nothing. Maybe the solution would be to use the output
device more consistently, but I guess this requires modifications
which may reach the user interface. Julian's patches didn't do that;
ip(8) & Co. are still used as before.

> The *main* reason to use Julian's patches is the masquerading connection
> rerouting. This will fix the big bugs in your setup by just redirecting a
> masqueraded connection out to a different interface when the old one is
> dead.

Could you elaborate on what those bugs in my setup are? I would also
be very grateful for any suggestions on fixing them.

> This is *very* cool on UDP, and will make UDP failover to another
> route fully transparent.  However, it will not fix stateful protocols in
> which the server on the other side keeps state on the IP address it was
> talking to, such as SSH. It will fix the TOS nastiness OpenSSH brings to the
> fore, as it will *reroute* after masquerading. Bit of a hack, that. I
> simply nixed the TOS bits in the firewalling code. :)

I also thought it a hack when I learned about the rerouting at NAT
time for masquerade, but then, first, it works, and second, this seems
to have been the way to avoid rewriting most of the Linux networking
code.

Also, a stateful connection really can't survive. We do not have any
support on the ISP side; the connections are 100% independent, and
most of the time one ISP doesn't (and shouldn't) even know of the
other. There _is_ no solution to it. If you request a download, the
remote server will start sending the packets to one particular IP. How
do you suggest persuading that remote server to continue sending them
to another IP? I guess many crackers would love to have such an
opportunity to take over a connection! The remote server has no way of
knowing whether the two IPs are on the same computer or thousands of
kilometers apart.
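
The point can be made concrete with a toy model of how a remote server
tracks a connection. It assumes only that the server identifies each
connection by both address/port pairs, as TCP does; the class and field
names are made up for illustration and the addresses are hypothetical.

```python
from typing import Dict, NamedTuple, Optional


class ConnKey(NamedTuple):
    """The four values that identify one connection on the server."""
    client_ip: str
    client_port: int
    server_ip: str
    server_port: int


class Server:
    """Toy server that remembers connections by their ConnKey."""

    def __init__(self) -> None:
        self.connections: Dict[ConnKey, str] = {}

    def open(self, key: ConnKey) -> None:
        self.connections[key] = "ESTABLISHED"

    def lookup(self, key: ConnKey) -> Optional[str]:
        # A packet from an unknown address/port combination does not
        # match any existing connection.
        return self.connections.get(key)


server = Server()
via_line_1 = ConnKey("192.0.2.28", 40000, "203.0.113.5", 22)
server.open(via_line_1)

# After failover the same session is masqueraded behind the other line.
via_line_2 = via_line_1._replace(client_ip="198.51.100.29")
```

Here server.lookup(via_line_1) finds the session, while
server.lookup(via_line_2) returns None: to the server, the packets
arriving over the second line belong to no connection it knows.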

> Summarizing: Yes, you can do equal cost multipath. Yes it is cool. Yes it
> can be made nicer and friendlier to set up using Julian's patches. However,
> it will not be an ideal solution. Things *will* break. Load will just be
> approximately balanced. Failover is in most cases definitely not transparent
> to the user: new connections have to be set up. If the links stay up though,
> equal cost multipath is a *good* thing.

No. Things do *not* break beyond having to restart certain connections
(see above). I have been using this for more than a week now, with
hundreds of thousands of connections a day, and it never broke. It
works more smoothly than I had dared to wish for in a long time. The
purpose of having more than one line is that the ISPs are unreliable.
There are three to four line failures per day on average. Nobody ever
noticed any failover. OK, most users are just surfing the net and not
doing prolonged downloads or ssh sessions. I use ssh for maintenance
and do know that a failing line can require opening a new session. But
now I can do that, whereas before I had to wait a couple of minutes
until my daemon had reconfigured the network, including the netfilter
rules, and established the new situation. I did mention the result of
load balancing in the nano howto; so far I have observed a daily
imbalance of maybe some 15%, which tends to diminish over longer
periods of time.
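
Why an imbalance would shrink over longer periods can be illustrated
with a toy model. It assumes, purely for illustration, that each new
connection is assigned to one of two equal lines independently and
with equal probability, and it counts connections rather than bytes.
This is not how the kernel selects a route, and the names simulate and
imbalance are made up here.

```python
import random
from typing import List


def simulate(n_connections: int, seed: int = 0) -> List[int]:
    """Assign each connection to line 0 or line 1 at random.

    Returns the number of connections that ended up on each line.
    """
    rng = random.Random(seed)
    counts = [0, 0]
    for _ in range(n_connections):
        counts[rng.randrange(2)] += 1
    return counts


def imbalance(counts: List[int]) -> float:
    """Difference between the two lines as a fraction of the total."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return abs(counts[0] - counts[1]) / total
```

Comparing imbalance(simulate(100)) with imbalance(simulate(100000))
for a few seeds shows the typical tendency: the more connections are
counted, the smaller the relative difference between the lines.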

> Oh, and it does work on >2 uplinks. I've set up a system for a client using
> 1 ISDN line, 2 ADSL links (with the Dutch MXStream cruftiness, but I
> digress) and 1 cable modem using masquerading only on the last three, using
> the standard kernel (Julian's patches didn't exist a year ago, when I did
> this). Worked splendidly (and still does, I'm told). Needed some manual
> supervision though, as link failover and especially failback is *not*
> trivial.

Well, if you did that, you did it very much in secret! Many people
around the globe, including myself, asked on this and other lists for
an answer. I do remember a two-line answer from you, which in that
form was useless because it just wouldn't work. A second direct
question remained without reply. Maybe it actually didn't work so
splendidly after all, because after having studied Julian's patches, I
don't really see how it could be done. But then, you obviously have
much more advanced knowledge at your disposal.

--
Christoph Simon
ciccio@xxxxxxxxxxxxxxx
---
^X^C
q
quit
:q
^C
end
x
exit
ZZ
^D
?
help
.
