On Tuesday 06 January 2004 02:25 am, Gordan Bobic wrote:
> If one of the default routes is removed, everything works OK. However, if
> there are two default routes, packets get misdirected. ChangeLog for 2.4.21
> lists a few conntrack bug fixes, which I suspect to be the cause of this.
> Basically, the non-deterministic default route selection/rotation seems to
> take precedence over maintaining the same interface for serving a
> particular established connection through the firewall.

You are right. This is because the routing core is often queried more than
once to set up a usable route cache entry for a given connection/session.
Have a look at ip_route_connect() in linux/include/net/route.h as an example:

    static inline int ip_route_connect(struct rtable **rp, u32 dst,
                                       u32 src, u32 tos, int oif)
    {
            int err;

            err = ip_route_output(rp, dst, src, tos, oif);
            if (err || (dst && src))
                    return err;
            dst = (*rp)->rt_dst;
            src = (*rp)->rt_src;
            ip_rt_put(*rp);
            *rp = NULL;
            return ip_route_output(rp, dst, src, tos, oif);
    }

Consider when this function is called with src==0, which happens for
locally generated output (SNAT is similar, I believe). The first
ip_route_output() call returns a pointer to a route cache entry, which
includes a src IP in (*rp)->rt_src. The first route cache entry doesn't
work for us, because its 'key' has src==0 and so won't match subsequent
traffic. So a second ip_route_output() is called using the new src as part
of its key. The new key matches no existing route cache entry, and as a
result the default multipath route is consulted again and a nexthop is
determined. This latter process does not use src in its processing, so
there is no guarantee that the nexthop returned is the same as the one
returned by the first query. Hence, the src IP is not guaranteed to match
the outbound interface.

Julian Anastasov's patches, noted earlier in this thread, provide a
solution to this problem.
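
To make the double lookup described above concrete, here is a toy model
of it. It is only a sketch and not kernel code: every toy_* name is
hypothetical, the route cache is left out, and a simple round-robin
counter stands in for whatever selection the multipath default route
really performs. The one property it tries to capture is that the
nexthop choice ignores src.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model only: all names here are hypothetical, none are kernel APIs. */
struct toy_nexthop {
    const char *ifname;   /* outbound interface */
    uint32_t    src;      /* preferred source address on that interface */
};

struct toy_route {
    const char *ifname;
    uint32_t    src;
    uint32_t    dst;
};

/* A multipath default route with two nexthops. */
static const struct toy_nexthop toy_default[2] = {
    { "eth0", 0x0a000001u },   /* 10.0.0.1 */
    { "eth1", 0x0a000101u },   /* 10.0.1.1 */
};

/* Assumption: round-robin stands in for the real nexthop selection. */
static unsigned toy_rr;

/* Stand-in for ip_route_output(): the nexthop is chosen without
 * looking at src; src is only filled in when the caller passed 0. */
static int toy_route_output(struct toy_route *rt, uint32_t dst, uint32_t src)
{
    const struct toy_nexthop *nh;

    assert(rt != NULL);
    nh = &toy_default[toy_rr++ % 2];
    rt->ifname = nh->ifname;
    rt->dst = dst;
    rt->src = src ? src : nh->src;
    return 0;
}

/* Same two-step shape as ip_route_connect(): look up once to learn
 * src, then look up again with the completed (dst, src) key. */
static int toy_route_connect(struct toy_route *rt, uint32_t dst, uint32_t src)
{
    int err = toy_route_output(rt, dst, src);

    if (err || (dst && src))
        return err;
    dst = rt->dst;
    src = rt->src;
    return toy_route_output(rt, dst, src);
}

/* Returns 1 if rt->src is the address configured on rt->ifname. */
static int toy_route_consistent(const struct toy_route *rt)
{
    size_t i;

    for (i = 0; i < 2; i++)
        if (toy_default[i].src == rt->src)
            return strcmp(toy_default[i].ifname, rt->ifname) == 0;
    return 0;
}
```

With toy_rr starting at 0, calling toy_route_connect() with src set to 0
picks eth0 on the first lookup and eth1 on the second. The resulting
route carries the source address of eth0 but leaves through eth1, so
toy_route_consistent() returns 0.
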
Julian's approach allows for additional route rules and route tables that
are matched by the second route query in preference to the default route,
so the src IP and outbound interface can be forced to be consistent.

I'm still pretty new to all this, so I hope Julian or someone else can
correct any errors I have made. The example above is the non-NAT case of
locally generated traffic, but I believe it's representative of what
happens in the SNAT case as well.

> I'm compiling a new clean 2.4.24 with the jumbo routes patch at the moment,
> which will hopefully fix things. I'm hoping to try it out tonight. And BTW,
> the latest RH9 kernel released yesterday (2.4.20-28.9 IIRC), is still
> broken as far as routing is concerned.

I haven't looked at RedHat's route patch; it'd be killer if they solved
this without requiring the additional route rules and table setups
required by Julian's patches. Let us know the outcome, would you?

The reason for this behavior makes sense from a code perspective, but not,
IMO, from a route administration perspective. I have a patch in its infancy
that attempts to address this problem without requiring extra route
administration (rules and tables). It works in the non-NAT case, but there
is still much more testing to go before it's worth publishing. If it
survives the next few weeks of testing, I'd be happy to pass it on to
anyone else who might be interested in playing with it.

Best Regards,
Steve
_______________________________________________
LARTC mailing list / LARTC@xxxxxxxxxxxxxxx
http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://lartc.org/