Re: where the indirection layer belongs

"J. Noel Chiappa" <jnc@ginger.lcs.mit.edu> · Tue, 2 Sep 2003 11:37:55 -0400

    > From: Robert Honore <robert@digi-data.com>

    > I agree that the "bug" in this picture here is that nodes can have
    > multiple addresses, some of which work and some of which don't work in
    > different circumstances. It really should be that an address which is
    > advertised for a node should be valid for communication with that node
    > under all circumstances.

If by "address" you mean "the names the routing system uses to keep track of
where things are", this contention of yours is not really feasible. What
you're basically saying is "the routing system has to be able to get to *any*
address, no matter what failures happen", and that's not so simple.

The simplest approach to doing so would basically mean injecting a route for
every separate physical network into the global routing table - because
depending on exactly which component or components failed (thereby making the
network unreachable as part of its original addressing aggregate), the
alternative path to it might lead almost anywhere in the network - hence the
requirement to circulate knowledge of that destination globally.

(Actually, it's even worse than "every separate physical network", because if
you have a physical network which can be broken into two separate working
parts, each part would need a separate, globally circulating routing table
entry.)
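
To make the cost concrete, here is a minimal sketch of longest-prefix-match
forwarding, assuming a toy table keyed by prefix. The prefixes, the next-hop
labels "via-A" and "via-B", and the lookup() helper are all hypothetical,
invented for illustration; this is not how any real router stores routes.
It shows only that once a piece of an aggregate is cut off, traffic for it
keeps following the aggregate until a more-specific entry is circulated.

```python
import ipaddress


def lookup(table, dst):
    """Return the next hop of the most specific route covering dst, or None."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, hop) for net, hop in table.items() if addr in net]
    if not matches:
        return None
    # Longest-prefix match: the route with the largest prefix length wins.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# Hypothetical global table: one aggregate covers many physical networks.
table = {ipaddress.ip_network("10.1.0.0/16"): "via-A"}

# Suppose 10.1.7.0/24 is partitioned from the aggregate and is now reachable
# only through B. With just the aggregate, traffic still heads toward A.
before = lookup(table, "10.1.7.9")

# Restoring reachability costs one extra globally visible entry.
table[ipaddress.ip_network("10.1.7.0/24")] = "via-B"
after = lookup(table, "10.1.7.9")

# The rest of the aggregate is unaffected by the more-specific route.
other = lookup(table, "10.1.8.9")
```

Multiply that one extra entry by every network (or fragment of a network)
that could be cut off, and you have the table-growth problem described above.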

The obvious second thought is "Oh, but we don't need to put these things in
the routing table - when an addressing aggregate partitions, we can construct
a tunnel to the 'lost' part." The problem with this idea is "how do you find
where the lost part is, and how to get to it"? The simplest answer is - you
guessed it - to inject that destination into the routing.

Again you have two choices - i) do it once a failure happens, which first
means you have to detect such failures, and which also means that until the
global routing stabilizes, that destination is unreachable (as well as a host
of other problems I touch on below), or ii) you do it in advance, which is the
preceding solution.

So an alternative to that would instead be for the partitioned section to try
and make contact with the large aggregate it was partitioned from - after
signing up with some local "care of" agent, one which does still have a
"working" address and will act on its behalf, and tunnelling its traffic
through that entity.

But now you get into all sorts of wonderfully complex failure modes (as I
alluded to above). E.g. how do you decide which of the two parts is the "main"
part of the partitioned addressing entity, i.e. the one that gets to keep
advertising the entire aggregate? And what do you do if the "main" part of the
aggregate was completely disconnected from the network, so it can't act on
behalf of the orphaned piece? And how do you tell the difference between an
aggregate piece which has been disconnected by a failure, and an aggregate
which is a "hole" punched when someone moved and took their addresses with
them (which should be advertised normally, globally - although no doubt some
would say that having to use a tunnel through a "care of" agent would be
condign treatment for such people :-)?

And I won't even get into the issue of how to make this all robust and
secure; I can think of the most wonderful attacks in which I use these
mechanisms to divert traffic.

So all of a sudden the "advertise all the pieces all the time" approach looks
a lot simpler - until you realize that it won't scale, and so is impossible.

So it's nice to say "an address .. should be valid for communication .. under
all circumstances", but if you actually try and think about how you'd make
that work, it's not so simple.

	Noel