I'm rather reluctant to add real technical discussion to the issue of
list mismanagement.

On Tue, 27 Sep 2005, Bill Sommerfeld wrote:

> On Tue, 2005-09-27 at 10:06, Robert Elz wrote:
> >     Date:        Mon, 26 Sep 2005 15:41:56 -0400 (EDT)
> >     From:        Dean Anderson <dean@xxxxxxx>
> >     Message-ID:  <Pine.LNX.4.44.0509261531270.32513-100000@xxxxxxxxxxxxxx>
> >
> >   | It is not DNSSEC that is broken.
> >
> >   I have not been following dnsop discussions, but from this summary,
> >   there is nothing broken beyond your understanding of what is
> >   happening.
>
> It's worse.  The reasoning is broken on other points, as well.
>
> In these arguments, RFC 1812 has been cited repeatedly as a
> specification for load-splitting.  By my reading, 1812 is extremely
> vague about the topic, and does not require a specific spreading
> algorithm.

Yes. It gives the implementor tremendous latitude. But plainly, it is
appropriate to do per-packet load balancing (as Cisco did), where
successive packets can be expected to take different paths.

> Its strongest recommendation is that there be a way to turn it off if
> it doesn't work for you, which should by itself be a clue that
> load-spreading should be used with caution; it also cautions that
> load-splitting was an area of active research at the time 1812 was
> published.

And now there are implementations and users that use it. But to make
anycast work with TCP, or with large UDP and fragments, one needs to
guarantee that two successive packets (actually an entire session) use
exactly the same path. No load balancing (or only very coarse-grained
load balancing) is required. The prescription given in RFC 1546 needs
to be changed:

RFC 1546, page 5:
---------------------------------------------------
   How UDP and TCP Use Anycasting

   It is important to remember that anycasting is a stateless service.
   An internetwork has no obligation to deliver two successive packets
   sent to the same anycast address to the same host.
---------------------------------------------------

RFC 1546 also gives a prescription for alterations to TCP so that TCP
can work with anycast under the condition on successive packets above.
So far as I know, no one has implemented this prescription in a TCP
stack.

> Moreover, load-splitting which results in the sort of flow-shredding
> which would disrupt multi-packet anycast exchanges also causes
> significant difficulties for unicast.  To quote from rfc2991 section 2:

RFC 2991 is an Informational document, and is wrong in some of its
assertions. This was discussed on the GROW list.

>    Variable Path MTU
>       Since each of the redundant paths may have a different MTU,
>       this means that the overall path MTU can change on a packet-
>       by-packet basis, negating the usefulness of path MTU discovery.

This is not a real problem. The MTU is reduced to the smallest MTU of
any path. If PMTUD is turned off (an option rarely used), the DF bit is
also not set, and so packets will simply be fragmented. While the
smaller packet size might be sub-optimal on the larger-MTU paths, this
is just a (tiny) performance consideration. It is not the case that the
usefulness of path MTU discovery is negated.

>    Variable Latencies
>       Since each of the redundant paths may have a different latency
>       involved, having packets take separate paths can cause packets
>       to always arrive out of order, increasing delivery latency and
>       buffering requirements.
>
>       Packet reordering causes TCP to believe that loss has taken
>       place when packets with higher sequence numbers arrive before
>       an earlier one.  When three or more packets are received before
>       a "late" packet, TCP enters a mode called "fast-retransmit" [6]
>       which consumes extra bandwidth (which could potentially cause
>       more loss, decreasing throughput) as it attempts to
>       unnecessarily retransmit the delayed packet(s).  Hence,
>       reordering can be detrimental to network performance.

RFC 2991 also mis-states the TCP issue. RFC 2581 describes the fast
retransmit behavior as follows:

   The TCP sender SHOULD use the "fast retransmit" algorithm to detect
   and repair loss, based on incoming duplicate ACKs.  The fast
   retransmit algorithm uses the arrival of 3 duplicate ACKs (4
   identical ACKs without the arrival of any other intervening packets)
   as an indication that a segment has been lost.  After receiving 3
   duplicate ACKs, TCP performs a retransmission of what appears to be
   the missing segment, without waiting for the retransmission timer to
   expire.

RFC 2991 mis-states this as follows:

   When three or more packets are received before a "late" packet, TCP
   enters a mode called "fast-retransmit"

This is not the case. [However, if it were the case, it would still
only affect 6% of the packets.] A fast retransmit is made only after 4
identical ACK packets are received, which means that 4 packets have to
be received before the late packet.

A more thorough reading of RFC 2581 reveals when an ACK should be sent:

   A TCP receiver SHOULD send an immediate duplicate ACK when an out-
   of-order segment arrives.  The purpose of this ACK is to inform the
   sender that a segment was received out-of-order and which sequence
   number is expected.

   From the sender's perspective, duplicate ACKs can be caused by a
   number of network problems.  First, they can be caused by dropped
   segments.  In this case, all segments after the dropped segment will
   trigger duplicate ACKs.  Second, duplicate ACKs can be caused by the
   re-ordering of data segments by the network (not a rare event along
   some network paths [Pax97]).

While out-of-order packets could trigger the fast retransmit, this
occurs just 3% of the time, so just 3% of packets are unnecessarily
retransmitted. Not a great performance impact. But again, at worst,
this is merely a performance issue that may be more than compensated
for by the additional performance and availability of multiple diverse
links.
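To make that arithmetic concrete, here is a minimal sketch (Python, with
made-up names; it is not taken from any real TCP stack) of the
sender-side counting RFC 2581 describes: fast retransmit fires on the
third duplicate ACK, i.e. only after four identical ACKs have arrived.

---------------------------------------------------
# Sketch of RFC 2581 sender-side duplicate-ACK counting.
# Names (DUP_ACK_THRESHOLD, Sender, ...) are illustrative only.

DUP_ACK_THRESHOLD = 3      # 3 duplicate ACKs = 4 identical ACKs in a row

class Sender:
    def __init__(self):
        self.last_ack = None   # highest cumulative ACK seen so far
        self.dup_acks = 0      # duplicates of last_ack since it arrived

    def on_ack(self, ack_seq):
        if ack_seq == self.last_ack:
            # Identical ACK: the receiver saw something out of order
            # (or lost a segment) and repeated its cumulative ACK.
            self.dup_acks += 1
            if self.dup_acks == DUP_ACK_THRESHOLD:
                self.fast_retransmit(ack_seq)
        else:
            # New data acknowledged; reset the duplicate counter.
            self.last_ack = ack_seq
            self.dup_acks = 0

    def fast_retransmit(self, seq):
        print("fast retransmit of segment starting at", seq)

s = Sender()
for ack in [1000, 1000, 1000, 1000]:   # original ACK plus 3 duplicates
    s.on_ack(ack)                      # retransmits only on the 4th
---------------------------------------------------

A path that reorders a packet by only one or two positions produces one
or two duplicate ACKs, never reaches the threshold, and causes no
spurious retransmission at all.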
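On the load-splitting point itself, the "very coarse-grained" balancing
I mentioned above is easy to picture. Below is a sketch of per-flow
(hash-based) next-hop selection; the names and addresses are invented
for the example, and this is not a description of any particular
vendor's implementation. Every packet of a flow hashes to the same
path, so a multi-packet anycast exchange is not shredded, and the
failover benefit discussed below falls out of simply removing the dead
path from the candidate list.

---------------------------------------------------
# Illustrative per-flow (hash-based) next-hop selection; not router code.
import zlib

def pick_next_hop(next_hops, src_ip, dst_ip, proto, src_port, dst_port):
    """Hash the 5-tuple so every packet of a flow takes the same path."""
    key = ("%s|%s|%d|%d|%d" % (src_ip, dst_ip, proto,
                               src_port, dst_port)).encode()
    return next_hops[zlib.crc32(key) % len(next_hops)]

paths = ["path-A", "path-B", "path-C"]

# Every packet of this TCP session maps to the same next hop:
print(pick_next_hop(paths, "192.0.2.1", "198.51.100.7", 6, 52311, 53))

# If path-B fails, drop it from the candidate list; affected flows
# rehash onto the surviving paths immediately:
print(pick_next_hop([p for p in paths if p != "path-B"],
                    "192.0.2.1", "198.51.100.7", 6, 52311, 53))
---------------------------------------------------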
But let's not forget the benefits of load balancing over diverse paths.
For example, when a path fails, it can be immediately removed from the
router's FIB, and another path can be used immediately, without waiting
for the routing processes to select the next best route and add it to
the FIB [no more blackholes until the next BGP scan after a link
failure]. While this is of little benefit to SMTP, it greatly benefits
VOIP and streaming audio and video.

VOIP RTP buffers have no such performance issues with multipath. As
long as each packet arrives before it is to be consumed, it does not
matter what order the packets arrive in. PPLB would greatly improve
VOIP performance characteristics.

> And folks I know who build gear which does load-splitting seem to be
> scrupulously careful to avoid these sorts of problems.

The equipment cannot do anything to avoid these problems, except turn
off load balancing if necessary.

		--Dean

-- 
Av8 Internet   Prepared to pay a premium for better service?
www.av8.net         faster, more reliable, better service
617 344 9000

_______________________________________________
Ietf@xxxxxxxx
https://www1.ietf.org/mailman/listinfo/ietf