Re: A question about routing cache (for load balancing).

Eliezer Croitoru <eliezer@xxxxxxxxxxxx> · Fri, 08 Nov 2013 03:23:56 +0200

On 11/08/2013 02:39 AM, Humberto Jucá wrote:
> You're talking about this:
> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=89aef8921bfbac22f00e04f8450f6e447db13e42
Exactly!!!

I will need to research it too.

The main problem with load balancing based on the route cache is that 
it's not real load balancing but rather an exploit of the way the 
routing code is designed.

A route cache entry is per *route*, which means that all traffic from 
this "src ip" that travels through "this local ip" towards "this remote 
ip" will continue to be routed the same way for the next 300 secs, as a 
route cache entry states:
# ip route get 209.169.10.131
209.169.10.131 via 192.168.10.254 dev eth0  src 192.168.10.1
    cache  mtu 1500 advmss 1460 hoplimit 64
##end

So let's say I open 20 connections towards the same host: the exact 
same gateway will be used for all of them as long as garbage collection 
does not remove the entry.
In that time the nexthop/gateway could go down and come back up 60+ 
times.
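
To illustrate the stickiness, here is a toy model in Python. It is not 
kernel code: the RouteCache class, the round-robin choice on a cache 
miss, and the second gateway 192.168.10.253 are assumptions made up for 
the example, and the 300 second lifetime is simply the figure mentioned 
above.

```python
# Toy model of a per-route cache (illustration only, not kernel code).
CACHE_TTL = 300  # seconds; the lifetime mentioned above, assumed fixed here


class RouteCache:
    def __init__(self, gateways):
        self.gateways = gateways   # candidate next hops
        self.next_index = 0        # round-robin pointer, used on cache misses
        self.entries = {}          # (src, dst) -> (gateway, expiry time)

    def lookup(self, src, dst, now):
        entry = self.entries.get((src, dst))
        if entry and entry[1] > now:
            return entry[0]        # cache hit: the same gateway is reused
        # Cache miss or expired entry: pick the next gateway and cache it.
        gateway = self.gateways[self.next_index % len(self.gateways)]
        self.next_index += 1
        self.entries[(src, dst)] = (gateway, now + CACHE_TTL)
        return gateway


# 192.168.10.253 is a hypothetical second gateway.
cache = RouteCache(["192.168.10.254", "192.168.10.253"])

# 20 connections to the same host, one per second.
used = [cache.lookup("192.168.10.1", "209.169.10.131", now=t) for t in range(20)]
print(set(used))

# Only after the entry expires does the other gateway get a chance.
after_expiry = cache.lookup("192.168.10.1", "209.169.10.131", now=301)
print(after_expiry)
```

All 20 lookups land on the first gateway, whatever happens to it in the 
meantime, because nothing in this model checks the gateway state before 
the entry expires.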

What I understood from the kernel change is that a routing system 
should use the FIB to calculate the right path for every packet (10Mpps 
is enough?), as expected from a dynamic routing system, instead of 
routing packets based on a cache and falling back to a FIB lookup when 
needed (which means it can take two lookups: one in the cache and a 
second in the FIB).

For a load balancer I would assume there is a need for iptables 
connection marking, which can really follow the TCP and UDP connections 
instead of just routing based on the cache.
In any case, iptables-based load balancing at the TCP level (not the 
application level) can take a few more cycles and a bit more RAM, but it 
is still faster than any proxy application.
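
A rough sketch of the per-connection idea, again as a toy Python model 
and not as iptables rules. The FlowBalancer class, the gateway names 
"gw-a" and "gw-b" and the round-robin assignment are assumptions for 
the example; the point is only that the decision is keyed on the whole 
connection (protocol, addresses and ports) and saved, roughly the way a 
connection mark is saved on the first packet and restored on later ones.

```python
# Toy model of per-connection marking (illustration only, not iptables).
from collections import Counter


class FlowBalancer:
    def __init__(self, gateways):
        self.gateways = gateways
        self.new_flows = 0   # number of connections seen so far
        self.marks = {}      # 5-tuple -> gateway index (the "connection mark")

    def route(self, proto, src, sport, dst, dport):
        flow = (proto, src, sport, dst, dport)
        if flow not in self.marks:
            # First packet of a connection: assign a mark and save it.
            self.marks[flow] = self.new_flows % len(self.gateways)
            self.new_flows += 1
        # Every packet: look up the saved mark and route by it.
        return self.gateways[self.marks[flow]]


lb = FlowBalancer(["gw-a", "gw-b"])  # hypothetical gateway names

# 20 TCP connections to the same host differ only in their source port.
flows = [("tcp", "192.168.10.1", 40000 + i, "209.169.10.131", 80)
         for i in range(20)]
first = [lb.route(*f) for f in flows]   # first packet of each connection
again = [lb.route(*f) for f in flows]   # later packets of the same connections
spread = Counter(first)
print(spread)
```

Unlike the route cache case, the 20 connections to one host are split 
between both gateways, while each individual connection keeps the 
gateway it started with.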

But an application that constantly monitors the LB router and the load 
on the service servers can change "static" routes to make sure that the 
load is distributed.
In this case there must be some connection tracking on the LB to make 
sure that TCP connections do not break for the clients in the middle of 
a connection, which can lead to a "read error", for example.
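
To show why the tracking matters when routes are changed on the fly, 
here is one more toy Python sketch. The TrackedBalancer class, its 
set_preferred() call (standing in for the monitoring application) and 
the addresses are all hypothetical; real connection tracking also has 
timeouts and TCP state, which are left out here.

```python
# Toy model: route changes must not move established connections.
class TrackedBalancer:
    def __init__(self, gateway):
        self.preferred = gateway   # where new connections go right now
        self.conntrack = {}        # 5-tuple -> gateway chosen at setup time

    def set_preferred(self, gateway):
        # Called by the (hypothetical) monitoring application.
        self.preferred = gateway

    def route(self, flow):
        # Known connections keep their gateway; new ones follow the change.
        return self.conntrack.setdefault(flow, self.preferred)

    def close(self, flow):
        # Connection finished: forget it so a later one can be rebalanced.
        self.conntrack.pop(flow, None)


lb = TrackedBalancer("gw-a")
old = ("tcp", "10.0.0.5", 40001, "209.169.10.131", 80)
new = ("tcp", "10.0.0.5", 40002, "209.169.10.131", 80)

before = lb.route(old)       # connection set up via gw-a
lb.set_preferred("gw-b")     # monitor decides gw-a is overloaded
during = lb.route(old)       # mid-connection packet stays on gw-a
fresh = lb.route(new)        # a new connection goes to gw-b
lb.close(old)
reopened = lb.route(old)     # same 5-tuple, new connection: gw-b
print(before, during, fresh, reopened)
```

Without the conntrack table, the mid-connection packet would follow the 
new route and the client would most likely see the connection break.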

How to handle these errors? I think that is another subject, which I 
want to read more about later on.

Eliezer
