On Tue, 4 Dec 2001 09:52:49 +0100 (MET)
Arthur van Leeuwen <arthurvl@xxxxxxxxxx> wrote:

> Okay, I've read both the nanohowto and the docs on Julian's patches by now.
> A few things to note: the nanohowto's information is good even without
> Julian's patches, although things will become trickier. One has to do ones
> own link-probing and rerouting from userland. That is very doable however,
> provided you have machines somewhere at the ISP's site that will answer to
> either pings ore traceroutes or somesuch, as you will need answers.

I'm not sure which situation Julian wanted to address with the patches,
but my reason for using them was to have more than one line, to use both
at the same time, and to be able to continue in case of failures. I have no
relationship with my ISPs. They are not going to cooperate in any sense,
often not even when paid lots of money, and often they do things just to
make it more difficult, because they want our money but do not want us to
actually _use_ their installations (this is a special greeting for Spanish
Telefónica and Spanish Terra). If I had access to ISP cooperation, I could
use perfect line balancing between two known machines, or much easier:
just ask for one bigger line. But the idea is precisely NOT to use the same
ISP, because they are unreliable. Again, just let me name as an example
the Spanish company Terra operating in Brazil, which has huge money and
demonstrates record-beating levels of incompetence and bad faith.

> The patches Julian provided fix a bunch of nastiness. For one, dead gateway
> detection is done on the ARP level in kernelspace. Very neat when you have
> ARP, thus on ethernet, but not very useful without. Furthermore, they
> provide true alternative routes, not only multipath default routes. This is
> once more extremely neat, but not directly necessary for the usual case.

This is not strictly true, as far as I understood Julian's comments.
The requirement is not ARP as such but some support from the link level.
ARP provides it, but other protocols could provide it as well, or at least
could be made to. So if it doesn't work with non-ARP devices, the bug is not
in Julian's patches but in the implementation of those protocols.

> Thirdly, Julian's patches add gateways as a routing key. This will not help
> pure routing boxes, such as would be standard issue in an office full of
> Windows toasters, as the gateway will be determined at the routing stage, so
> it cannot be used as a key.

I did think of that, and one consequence, at least an indirect one, is that
the gateways actually must be different. I had one case where both lines
came from the same ISP (for regional reasons of availability) and I got two
compatible IPs with the same gateway:

  IFE1 eth0   IPE1 200.201.202.28   GWE1 200.201.202.1
  IFE2 eth1   IPE2 200.201.202.29   GWE1 200.201.202.1

(only the fourth octets are the actual ones)

At that time I didn't know of Julian's patches and wrote a daemon which
would reconfigure the network on failure or recovery of the lines. To do
that, I would keep one explicit host route to each gateway, send pings to
both, and set the default route to the preferred line if both were working,
or to the working line if one had failed. With a patched or unpatched
kernel, what would you do? To make it more difficult, this was still before
2.4 kernels, using route and ifconfig only, no ip(8). A priori, it is not
obvious why this doesn't work, but it just can't work. I gave back one line
and hired another one, because once again the ISP didn't want to cooperate,
preferring to charge for uninstallation and installation again (some 300 US$
at the time). Not even arping broadcasts will work. Maybe a hack with
source-routed packets would, but I usually disable this possibility in the
kernel. Yes, the dependency on the gateway is a limitation, but it's _much_
better than nothing.
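
The decision step of such a daemon (ping both gateways, prefer one line,
fall back to the other) can be sketched roughly as follows. This is only an
illustration in Python, not the original daemon: the gateway addresses,
device names and function names are hypothetical, and the probing itself is
left out.

```python
# Hypothetical uplinks, preferred line first. The addresses are
# documentation placeholders (RFC 5737), not the ones from the setup above.
LINES = [
    {"dev": "eth0", "gw": "192.0.2.1"},
    {"dev": "eth1", "gw": "198.51.100.1"},
]


def choose_line(lines, alive):
    """Return the first line, in preference order, whose gateway answered.

    `alive` maps a gateway address to True/False (the result of a ping
    probe done elsewhere). Returns None if no gateway answered.
    """
    for line in lines:
        if alive.get(line["gw"], False):
            return line
    return None


def route_commands(current_gw, chosen):
    """Build the route(8) commands needed to move the default route.

    Returns an empty list if nothing has to change, either because the
    chosen line is already the default or because no line is usable.
    """
    if chosen is None or chosen["gw"] == current_gw:
        return []
    cmds = []
    if current_gw is not None:
        # Drop the old default route before installing the new one.
        cmds.append("route del default")
    cmds.append(f"route add default gw {chosen['gw']} dev {chosen['dev']}")
    return cmds
```

With probe results such as {"192.0.2.1": False, "198.51.100.1": True},
choose_line picks the eth1 entry and route_commands returns the commands to
switch the default route over. The sketch assumes that the gateways have
distinct addresses; with a shared gateway the probes cannot tell the lines
apart.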
Maybe the solution would be to use the output device more consistently, but
I guess this requires modifications which may reach the user interface.
Julian's patches didn't do that; ip(8) & Co. are still used as before.

> The *main* reason to use Julian's patches is the masquerading connection
> rerouting. This will fix the big bugs in your setup by just redirecting a
> masqueraded connection out to a different interface when the old one is
> dead.

Could you elaborate on what those bugs in my setup are? I would also be very
grateful for any suggestion for fixing them.

> This is *very* cool on UDP, and will make UDP failover to another
> route fully transparent. However, it will not fix stateful protocols in
> which the server on the other side keeps state on the IP address it was
> talking to, such as SSH. It will fix the TOS nastiness OpenSSH brings to the
> fore, as it will *reroute* after masquerading. Bit of a hack, that. I
> simply nixed the TOS bits in the firewalling code. :)

I also thought it a hack when I learned about the reroute at NAT time for
masquerading, but first, it works, and second, this seems to have been the
way to avoid rewriting most of the Linux networking code. Also, a stateful
connection really can't survive. We do not have any support on the ISP side;
the connections are 100% independent, and most of the time one ISP doesn't
(shouldn't) even know of the other. There _is_ no solution to it. If you
request a download, the remote server will start sending the packets to one
particular IP. How do you suggest persuading that remote server to continue
sending them to another IP? I guess many crackers would love to have such an
opportunity to take over a connection! The remote server has no way of
knowing whether the two IPs are on the same computer or thousands of
kilometers apart.

> Summarizing: Yes, you can do equal cost multipath. Yes it is cool. Yes it
> can be made nicer and friendlier to set up using Julian's patches.
> However, it will not be an ideal solution. Things *will* break. Load will
> just be approximately balanced. Failover is in most cases definitely not
> transparent to the user: new connections have to be set up. If the links
> stay up though, equal cost multipath is a *good* thing.

No. Things do *not* break beyond having to restart certain connections (see
above). I have been using this for more than a week now, with hundreds of
thousands of connections a day, and it never broke. It works as smoothly as
I had long been too afraid to wish for. The purpose of having more than one
line is that the ISPs are unreliable. There are three to four line failures
per day on average. Nobody ever noticed any failover. OK, most users are
just surfing the net and not doing prolonged downloads or ssh sessions. I
use ssh for maintenance and do know that a failing line can require opening
a new session. But now I can do that, whereas before I had to wait a couple
of minutes until my daemon had reconfigured the network, including the
netfilter rules, and established the new situation. I did mention the result
of load balancing in the nano howto; so far I have observed a daily
imbalance of maybe some 15%, which tends to diminish over longer periods of
time.

> Oh, and it does work on >2 uplinks. I've set up a system for a client using
> 1 ISDN line, 2 ADSL links (with the Dutch MXStream cruftiness, but I
> digress) and 1 cable modem using masquerading only on the last three, using
> the standard kernel (Julian's patches didn't exist a year ago, when I did
> this). Worked splendidly (and still does, I'm told). Needed some manual
> supervision though, as link failover and especially failback is *not*
> trivial.

Well, if you did that, you did it very much in secret! Many people around
the globe, including myself, asked on this and other lists for an answer. I
do remember a two-line answer from you, which in that form was useless
because it just wouldn't work.
A second direct question remained without reply. Maybe it actually didn't
work so splendidly after all, because after having studied Julian's patches,
I don't really see how it could be done. But then, you obviously have much
more advanced knowledge at your disposal.

--
Christoph Simon
ciccio@xxxxxxxxxxxxxxx
---
^X^C
q
quit
:q
^C
end
x
exit
ZZ
^D
?
help
.