iptables -t mangle -A PREROUTING -j ISP2

Doesn't this rule need to check for state NEW? Otherwise packets will not reach the restore-mark rule.

You may have to manually repopulate the routing tables when an interface comes back up after being down for some time. (The kernel will have removed the routing entries for this interface once it found the interface down. This happens only if its nexthop is down.)

I tend to favor this approach, because it is more flexible in selecting the interface. You can use different weights/probabilities depending on different factors. I have seen a variation of this method that uses the 'recent' match (-m recent) instead of CONNMARK.

The only downside of this method, as far as I can see, is the need to reconfigure rules and routing tables when a link fails or comes back up. But lately I have found that even with the multipath method there IS a need for reconfiguration.

-----Original Message-----
From: lartc-bounces@xxxxxxxxxxxxxxx [mailto:lartc-bounces@xxxxxxxxxxxxxxx] On Behalf Of Peter Rabbitson
Sent: Monday, May 14, 2007 3:16 PM
To: lartc@xxxxxxxxxxxxxxx
Subject: Re: Multihome load balancing - kernel vs netfilter

Salim S I wrote:
>> -----Original Message-----
>> From: lartc-bounces@xxxxxxxxxxxxxxx
>> [mailto:lartc-bounces@xxxxxxxxxxxxxxx] On Behalf Of Peter Rabbitson
>> Sent: Monday, May 14, 2007 1:57 PM
>> To: lartc@xxxxxxxxxxxxxxx
>> Subject: Multihome load balancing - kernel vs netfilter
>>
>> Hi,
>> I have searched the archives on the topic, and it seems that the list
>> gurus favor load balancing to be done in the kernel as opposed to other
>> means. I have been using a home-grown approach, which splits traffic
>> based on `-m statistic --mode random --probability X`, then CONNMARKs
>> the individual connections and the kernel happily routes them. I
>> understand that for > 2 links it will become impractical to calculate a
>> correct X.
>> But if we only have 2 gateways to the internet - are there
>> any advantages in letting the kernel multipath scheduler do the
>> balancing (with all the downsides of route caching), as opposed to the
>> pure random approach described above?
>
> I have thought about this approach, but, I think, this approach does not
> handle failover/dead-gateway-detection well. Because you need to alter
> all your netfilter routing rules if you find a link down. And then
> reconfigure again when the link comes up. I am interested to know how
> you handle that.
>

Certainly. What I am doing is NATing a large company network, which gets load balanced and receives failover protection. I also have a number of services running on the router which must be neither balanced nor failed over, as they are expected to respond on a specific IP only. All remaining traffic on the server itself is not balanced but fails over when the designated primary link goes down.

I start with a simple pinger app that pings several well-known remote sites once a minute using a large ICMP packet (1k of payload). The RTTs are averaged and used to calculate the current "quality" of the link (the large packet makes congestion a visible factor). If the response for one of the interfaces is 0 (meaning not a single one of the pinged hosts has responded), the link is dead.

In iproute I have two separate tables, each using one of the links as default gw, matching a certain mark.
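
A minimal sketch of what those two tables could look like, not the poster's actual configuration. The marks 11 and 12 are the ones set by the ISP1 and ISP2 chains in this message; the table numbers (111, 112), the gateway addresses, the interface names and the helper name policy_route_cmds are assumptions made for illustration. The function only prints the commands, so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Print (do not execute) the policy-routing commands for one uplink.
# Arguments: fwmark, routing table number, gateway IP, interface name.
policy_route_cmds() {
    mark=$1
    table=$2
    gw=$3
    dev=$4
    # Per-link table whose only entry is a default route via this link.
    echo "ip route replace default via $gw dev $dev table $table"
    # Packets carrying the mark are looked up in that table.
    echo "ip rule add fwmark $mark table $table"
}

# Hypothetical gateways and interfaces (documentation address ranges).
policy_route_cmds 11 111 192.0.2.1 eth1
policy_route_cmds 12 112 198.51.100.1 eth2
```

Piping the output to sh as root would apply the commands; keeping the generator separate from execution makes it easy to re-run after an interface has come back up and its routes are gone.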
The default route is set to a single gateway (not a multipath), either by hardcoding or by using the first result of the pinger (it can run without a default gw set; explanation follows).

In iptables I have two user-defined chains:

iptables -t mangle -A ISP1 -j CONNMARK --set-mark 11
iptables -t mangle -A ISP1 -j MARK --set-mark 11
iptables -t mangle -A ISP1 -j ACCEPT

iptables -t mangle -A ISP2 -j CONNMARK --set-mark 12
iptables -t mangle -A ISP2 -j MARK --set-mark 12
iptables -t mangle -A ISP2 -j ACCEPT

The rules that reference those chains are:

For all locally originating traffic:

iptables -t mangle -A OUTPUT -o $I1 -j ISP1
iptables -t mangle -A OUTPUT -o $I2 -j ISP2

For all incoming traffic from the internet:

iptables -t mangle -A PREROUTING -i $I1 -m state --state NEW -j ISP1
iptables -t mangle -A PREROUTING -i $I2 -m state --state NEW -j ISP2

For all other traffic (NAT):

iptables -t mangle -A PREROUTING -m state --state NEW -m statistic --mode random --probability $X -j ISP1
iptables -t mangle -A PREROUTING -j ISP2

At the end of the PREROUTING chain I have:

iptables -t mangle -A PREROUTING -j CONNMARK --restore-mark

The NATing is trivially solved by:

iptables -t nat -A POSTROUTING -s 10.0.58.0/24 -j SOURCE_NAT
iptables -t nat -A POSTROUTING -s 192.168.58.0/24 -j SOURCE_NAT
iptables -t nat -A POSTROUTING -s 192.168.8.0/24 -j SOURCE_NAT

iptables -t nat -A SOURCE_NAT -o $I1 -j SNAT --to $I1_IP
iptables -t nat -A SOURCE_NAT -o $I2 -j SNAT --to $I2_IP

What does this achieve:

* Local applications that have explicitly requested a specific IP to bind to will be routed over the corresponding interface and will stay that way. Only applications binding to 0.0.0.0 will be routed by consulting the default route.
* Responses to connections from the internet are guaranteed to leave from the same interface they came in on.
* All new connections not coming from the external interfaces are load balanced by the weight of $X, and are again guaranteed to stay there for the life of the connection, but another connection to the same host is not guaranteed to go over the same link. This is important in a company environment, since most employees use the same online resources.

On every run of the pinger I do the following:

* If both gateways are alive, I replace the -m statistic rule, adjusting the value of $X.
* If one is detected dead, I adjust the probability accordingly (or alternatively remove the statistic match altogether), and change the default gateway if it is the one that failed.

So really the whole exercise revolves around changing a single rule (or two rules, if you want to control the probability in a more fine-grained way).

Last but not least, this setup allowed me to program exception tables for certain IP blocks. For instance, Yahoo has a braindead two-tier authentication system for commercial solutions. It remembers the IP which you first used to log in, and it must match the IP used to log in to a more secure area (using another password). Or users from within the LAN might want to use one of the ISPs' SMTP servers, which keeps a close eye on who is talking to it. So I have a $PREFERRED which is adjusted to either ISP1 or ISP2, depending on the current state of affairs, and rules like:

iptables -t mangle -A PREROUTING -d 66.218.64.0/19 -m state --state NEW -j $PREFERRED
iptables -t mangle -A PREROUTING -d 68.142.192.0/18 -m state --state NEW -j $PREFERRED

This pretty much sums it up. The only downside I can think of is that loss of service can be observed between two runs of the pinger. Let me know if I missed something, be it critical or minor.
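
The per-run adjustment of $X described above could be sketched roughly as follows. The message does not say how the link "quality" is turned into a probability, so the inverse-RTT weighting here is purely an assumption, as are the helper name calc_probability, the sample RTT values and the rule position RULE_NUM. An averaged RTT of 0 is treated as "link dead", as in the pinger description. The iptables command is only printed, not executed.

```shell
#!/bin/sh
# Compute the probability X of sending a new connection to ISP1 from the
# averaged RTTs (in ms) of both links. An RTT of 0 means the link is dead.
# Assumption: each link gets a share inversely proportional to its RTT.
calc_probability() {
    # LC_ALL=C keeps the decimal separator a dot, which iptables expects.
    LC_ALL=C awk -v r1="$1" -v r2="$2" 'BEGIN {
        if (r1 == 0 && r2 == 0) { print "0.50"; exit }  # both dead: no preference
        if (r1 == 0) { print "0.00"; exit }             # ISP1 dead: all to ISP2
        if (r2 == 0) { print "1.00"; exit }             # ISP2 dead: all to ISP1
        printf "%.2f\n", r2 / (r1 + r2)                 # faster link gets more
    }'
}

# Sample values: ISP1 averages 40 ms, ISP2 averages 120 ms.
X=$(calc_probability 40 120)    # X=0.75

# RULE_NUM is a hypothetical position of the statistic rule in PREROUTING.
RULE_NUM=3
echo "iptables -t mangle -R PREROUTING $RULE_NUM -m state --state NEW" \
     "-m statistic --mode random --probability $X -j ISP1"
```

Replacing the rule in place with -R keeps its position in the chain, which matters here because the rules after it depend on ordering. Whether a probability of 0.00 or 1.00 is preferable to removing the statistic match altogether on failure is a design choice the message leaves open.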

Thanks
Peter
_______________________________________________
LARTC mailing list
LARTC@xxxxxxxxxxxxxxx
http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc