    > From: Robert Honore <robert@digi-data.com>

    > I agree that the "bug" in this picture here is that nodes can have
    > multiple addresses, some of which work and some of which don't work in
    > different circumstances. It really should be that an address which is
    > advertised for a node should be valid for communication with that node
    > under all circumstances.

If by "address" you mean "the names the routing system uses to keep track of where things are", this contention of yours is not really feasible. What you're basically saying is "the routing system has to be able to get to *any* address, no matter what failures happen", and that's not so simple.

The simplest approach to doing so would basically mean injecting a route for every separate physical network into the global routing table - because depending on exactly which component or components failed (thereby making the network unreachable as part of its original addressing aggregate), the alternative path to it might lead almost anywhere in the network - hence the requirement to circulate knowledge of that destination globally.

(Actually, it's even worse than "every separate physical network", because if you have a physical network which can be broken into two separate working parts, each part would need a separate, globally circulating routing table entry.)

The obvious second thought is "Oh, but we don't need to put these things in the routing table - when an addressing aggregate partitions, we can construct a tunnel to the 'lost' part."

The problem with this idea is "how do you find where the lost part is, and how to get to it"? The simplest answer is - you guessed it - to inject that destination into the routing.
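
To make the scaling point concrete, here is a minimal sketch of what "injecting that destination" means for a routing table that uses longest-prefix matching. Everything in it is an assumption for illustration only: the 10.0.0.0/16 aggregate, the /24 "physical networks", the provider names, and the `longest_prefix_match` helper are hypothetical and not taken from any real deployment or routing implementation.

```python
import ipaddress

def longest_prefix_match(table, dest):
    """Return the (prefix, next_hop) entry with the longest prefix covering dest, or None."""
    addr = ipaddress.ip_address(dest)
    matches = [(prefix, hop) for prefix, hop in table.items() if addr in prefix]
    if not matches:
        return None
    return max(matches, key=lambda entry: entry[0].prefixlen)

# Hypothetical aggregate: one global entry covers every network inside it.
aggregate = ipaddress.ip_network("10.0.0.0/16")
global_table = {aggregate: "provider-A"}

# A failure cuts one physical network off from the rest of the aggregate.
# To keep it reachable, a more-specific route is injected globally.
lost_part = ipaddress.ip_network("10.0.42.0/24")
global_table[lost_part] = "provider-B"

# Traffic to the lost part now follows the more-specific entry,
# while everything else still follows the aggregate.
via_lost = longest_prefix_match(global_table, "10.0.42.7")[1]
via_rest = longest_prefix_match(global_table, "10.0.7.7")[1]

# Doing this in advance for every /24 in the aggregate means one global
# entry per physical network instead of one entry for the whole aggregate.
routes_if_preinjected = len(list(aggregate.subnets(new_prefix=24)))
```

Under these assumptions a single aggregate entry turns into 256 globally circulating entries once every piece has to be individually reachable, and that is before counting networks that can split into more than one working part.
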
Again you have two choices - i) do it once failure happens, which first means you have to detect such failures, and which also means that until the global routing stabilizes, that destination is unreachable (as well as a host of other problems I touch on below), or ii) you do it in advance, which is the preceding solution.

So an alternative to that would instead be for the partitioned section to try and make contact with the large aggregate it was partitioned from - after signing up with some local "care of" agent, one which does still have a "working" address and will act on its behalf, and tunnelling its traffic through that entity.

But now you get into all sorts of wonderfully complex failure modes (as I alluded to above). E.g. how do you decide which of the two parts is the "main" part of the partitioned addressing entity, i.e. the one that gets to keep advertising the entire aggregate? And what do you do if the "main" part of the aggregate was completely disconnected from the network, so it can't act on behalf of the orphaned piece?

And how do you tell the difference between an aggregate piece which has been disconnected by a failure, and an aggregate which is a "hole" punched when someone moved and took their addresses with them (which should be advertised normally, globally - although no doubt some would say that having to use a tunnel through a "care of" agent would be condign treatment for such people :-)?

And I won't even get into the issue of how to make this all robust and secure; I can think of the most wonderful attacks in which I use these mechanisms to divert traffic.

So all of a sudden the "advertise all the pieces all the time" looks a lot simpler - until you realize that it won't scale, and is impossible.

So it's nice to say "an address .. should be valid for communication .. under all circumstances", but if you actually try and think about how you'd make that work, it's not so simple.

	Noel