Re: Route fallback issue

Linux Advanced Routing and Traffic Control


On 06/25/2018 12:50 PM, Julian Anastasov wrote:
Hello,

Hi Julian,

Yes, ARP state for unreachable GWs may be updated slowly, there is in-time feedback only for reachable state.

Fair.

Most of the installations where I needed dead gateway detection (D.G.D.) to work would be okay with a < 5 minute timeout. Obviously they would like it faster, but automation is a LOT better than waiting on manual intervention.

IMHO < 30 seconds is great. < 90 seconds is acceptable. < 300 seconds leaves some room for improvement.

You can create the two routes, of course. But only the default routes are alternative.

Are you saying that the functionality I'm describing only works for default gateways or that the term "alternative route" only applies to default gateways?

The testing that I did indicated that alternative routes worked for specific prefixes too.

I tested multiple NetNSs with only directly attached routes and appended routes to a destination prefix, no default gateway / route of last resort.

The behavior seemed to be different when ignore_routes_with_linkdown was set versus unset. Specifically, ignore_routes_with_linkdown seemed to help considerably.

Hence why I question the requirement for the "default" route versus a route to a specific prefix.

Can you explain why I saw the behavior difference with ignore_routes_with_linkdown if it only applies to the default route?
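
For anyone wanting to reproduce the comparison, this is roughly how the setting can be checked from a script. It is only a sketch: it assumes the usual /proc/sys layout for the IPv4 per-interface sysctls, and the helper names (linkdown_sysctl_path, read_ignore_linkdown) are my own, not part of any tool.

```python
from pathlib import Path

# Assumed location of the sysctl tree on a typical Linux system.
PROC_SYS = Path("/proc/sys")


def linkdown_sysctl_path(iface="all", root=PROC_SYS):
    """Build the path of the IPv4 ignore_routes_with_linkdown sysctl for iface."""
    return Path(root) / "net" / "ipv4" / "conf" / iface / "ignore_routes_with_linkdown"


def read_ignore_linkdown(iface="all", root=PROC_SYS):
    """Return True if the sysctl reads 1, False otherwise, None if it cannot be read."""
    try:
        text = linkdown_sysctl_path(iface, root).read_text()
    except OSError:
        # Older kernel, unknown interface, or not running on Linux.
        return None
    return text.strip() == "1"


if __name__ == "__main__":
    for iface in ("all", "default"):
        print(iface, read_ignore_linkdown(iface))
```

Run it inside each NetNS before and after changing the setting so that the two test runs are known to differ only in that value.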

The alternative routes work in this way:

- on lookup, routes are walked in order - as listed in table

- as long as route contains reachable gateway (ARP state), only this route is used

- if some gateway becomes unreachable (ARP state), next alternative routes are tried

- if ARP entry is expired (missing), this gateway can be probed if the route is before the currently used route. This is what happens initially when no ARP state is present for the GWs. It is bad luck if the probed GW is actually unreachable.

- active probing by user space (ping GWs) can only help to keep the ARP state present for the used gateways. This way, if the ARP entry for a GW is missing, the kernel will not risk selecting an unavailable route with the goal of probing the GW.

This all makes sense.
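
To check my reading of the rules above, here is a toy model in Python. It is not the kernel's code, just a sketch of the walk as described: the three states, the Route class, and select_route are names I made up, and the behavior when nothing is reachable (fall back to the first route with a missing ARP entry) is my assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Neigh(Enum):
    REACHABLE = "reachable"      # ARP entry confirms the gateway
    UNREACHABLE = "unreachable"  # ARP resolution failed
    MISSING = "missing"          # ARP entry expired or never created


@dataclass
class Route:
    gateway: str
    state: Neigh


def select_route(routes, current=None):
    """Return the index of the route to use, walking routes in table order.

    current is the index of the route in use now, or None if there is none.
    """
    fallback = None
    for idx, route in enumerate(routes):
        if route.state is Neigh.REACHABLE:
            return idx  # first reachable gateway wins
        if route.state is Neigh.MISSING:
            # A gateway without ARP state may be probed only if its route
            # sits before the route currently in use (or nothing is in use).
            if current is None or idx < current:
                return idx
            if fallback is None:
                fallback = idx  # assumption: remember it as a last resort
        # UNREACHABLE gateways are skipped.
    return fallback
```

With routes listed as [MISSING, REACHABLE] and the second route in use, this returns index 0, which is the "bad luck" case if that first gateway is actually down.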

Please confirm if "gateway" in this context is the "/default/ gateway" or not. I ask because arguably "gateway" can be used as a term to describe the next hop for a route, or gateway, to a prefix. Further, the "/default/ (gateway,router)" is the gateway or route of last resort. Which to me means that "gateway" can be any route in this context.

nexthop is the GW in the route

Thank you for confirming.

Yes, the kernel avoids alternative routes with unreachable GWs

Fair enough.

The multipath route uses all its alive nexthops at the same time... But you may need in the same way active probing by user space, otherwise unavailable GW can be selected.

I assume that the dead ECMP NEXTHOP is also subject to similar timeouts as alternative routes. Correct?

Yes, if you prefer, you may run PING every second to avoid such delays...

Agreed.
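
Something like the following is what I have in mind for the user space prober. It is a minimal sketch that assumes the iputils ping found on most Linux systems (-n, -c and -W options); the function names and the example gateway addresses are placeholders of mine.

```python
import subprocess
import time


def ping_command(gateway, timeout_s=1):
    """Build a single-packet ping command with a reply timeout in seconds."""
    return ["ping", "-n", "-c", "1", "-W", str(timeout_s), gateway]


def probe_once(gateways, runner=subprocess.run):
    """Ping each gateway once and return {gateway: True if it replied}."""
    results = {}
    for gw in gateways:
        proc = runner(
            ping_command(gw),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        results[gw] = proc.returncode == 0
    return results


def probe_forever(gateways, interval_s=1.0):
    """Probe all gateways about once per interval to keep ARP state fresh."""
    while True:
        started = time.monotonic()
        probe_once(gateways)
        # Sleep for whatever is left of the interval, never a negative time.
        time.sleep(max(0.0, interval_s - (time.monotonic() - started)))
```

Calling probe_forever(["192.0.2.1", "192.0.2.2"]) would ping both (hypothetical) gateways roughly every second. The side effect that matters here is the ARP traffic each ping triggers, not the result dictionary.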

I'm trying to make sure I understand basic functionality before I do things to modify it.



--
Grant. . . .
unix || die
