Bernard,

Thanks for the comments. Let me see if I can describe a scenario in which behavior-discovery is useful.

First, we don't want to "go back to 3489." There were two problems (well, there were a lot more problems, but I just want to talk about two right now) in particular that we don't ever want to go back to:

- 3489 specified that an application would start up, characterize its NAT, and work in that mode forever after.
- 3489 specified that if you had a friendly NAT, you could query the STUN server for your transport address and just use that address.

At the same time, behavior-discovery is targeting applications for which ICE doesn't necessarily make sense: for example, applications that don't want to fall back to TURN but have other options for establishing a connection (whether because indirect routing is acceptable, the connection isn't strictly needed, or for other reasons).

So let me try to go into more detail on a potential P2P application. When P2P node A starts up, it evaluates its NAT(s) relative to other nodes already in the overlay. Let's say its testing indicates it's behind a good NAT, with endpoint-independent mapping and filtering. In this case, the peer will join the overlay and establish connections with appropriate peers, but it will also advertise to any node that wants to reach it that there is no need to route through the overlay network formed by the P2P nodes (the normal routing mode in a P2P overlay); messages can be sent directly to its IP address.

So when node B wants to send a message to A, it sends the message directly to A's IP address and starts a timer. If no response arrives within a certain amount of time, B routes the message to A across the overlay instead. (Alternatively, B could send the message to A's IP address and across the overlay simultaneously, which guarantees minimum response latency but can waste bandwidth.)
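To make the fallback concrete, here is a rough Python sketch of what B's send path might look like. The overlay.route() call and the 2-second timeout are placeholders for whatever routing primitive and retransmission policy the overlay actually provides; treat this as an illustration, not a spec:

    import socket

    DIRECT_TIMEOUT = 2.0  # placeholder value; a real node would tune this

    def send_to_peer(message, peer_addr, overlay):
        # Try the advertised public address first; fall back to routing
        # across the overlay if the timer expires with no reply.
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.settimeout(DIRECT_TIMEOUT)
        try:
            sock.sendto(message, peer_addr)   # direct attempt
            reply, _ = sock.recvfrom(65535)   # the "timer" is the socket timeout
            return reply
        except socket.timeout:
            # No direct response in time; use normal overlay routing.
            return overlay.route(peer_addr, message)
        finally:
            sock.close()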
Over time, A observes what percentage of its incoming messages arrive directly compared to across the overlay. If the percentage of direct deliveries falls below some threshold (say 66%, picking an arbitrary number), it may stop advertising for direct connections. But if the percentage is high enough, it continues to advertise, because the advertisement may be helping performance. If at some point the NAT changes its behavior, A will notice the change in its direct-delivery percentage and can re-evaluate its decision to advertise a public address.
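The bookkeeping on A's side might look something like the sketch below. The class name and the sliding-window size are invented for illustration; the 66% threshold is the arbitrary one from above:

    from collections import deque

    class DirectAdvertiser:
        def __init__(self, threshold=0.66, window=1000):
            self.threshold = threshold
            self.history = deque(maxlen=window)  # True = message arrived directly
            self.advertising = True

        def record(self, arrived_directly):
            self.history.append(arrived_directly)

        def reevaluate(self):
            if not self.history:
                return self.advertising
            direct_fraction = sum(self.history) / len(self.history)
            # A NAT behavior change shows up as a falling direct fraction,
            # which turns the advertisement off (or back on if it recovers).
            self.advertising = direct_fraction >= self.threshold
            return self.advertising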
(There are a lot of other details about how this might work, how it would deal with multiple levels of NAT, and what the actual costs and benefits are. I don't want to get into all of those details here.)

This is a good example because behavior-discovery is used for the initial selection of operating mode, but the actual decision about whether to continue advertising that public IP/port pair is made based on real operating data. It also uses the result of the behavior-discovery work as an optimization, not in a manner where the application will fail if some percentage of the nodes in the overlay are unable to make a direct connection.

Bruce

On Sat, Apr 4, 2009 at 2:39 AM, Bernard Aboba <bernard_aboba@xxxxxxxxxxx> wrote:

> Bruce Lowekamp said:
>
> "Many of the questions you raise point to the same question of whether tests or techniques that are known to fail on a certain percentage of NATs under a certain percentage of operating conditions are nevertheless valuable. behavior-discovery has an applicability statement, http://tools.ietf.org/html/draft-ietf-behave-nat-behavior-discovery-06#section-1, that discusses those issues in some detail. I spent enough time wording that statement and discussing it with various people that I think it is best to refer to that statement.
>
> You also repeatedly use phrases such as "basically won't work" and "it might work." That comes down to the value of "certain percentage" as used above. My experience with these techniques, and the experience of those who have used such techniques recently, is that they are far more reliable than that, into the 90% range, particularly when used correctly. That is not high enough that we could go back to 3489 (all techniques require fallbacks because they fail, and 90% is far, far too low a success rate), but it is high enough that applications can make useful decisions based on that information, provided they have a fallback for the cases where the information is wrong. And those are the conditions of the experiment."
>
> What I am failing to understand is the distinction between the situations in which we "cannot go back to RFC 3489" and the scenarios envisaged for the experiment.
>
> Presumably, situations in which we "cannot go back to RFC 3489" include Internet telephony, which may be used for life-critical situations such as E911. For those kinds of scenarios, we need traversal technologies that are as reliable as possible, and we are willing to live with the complexity of ICE to achieve that.
>
> The draft mentions P2P applications as one potential situation in which the use of imperfect techniques is acceptable, and yet the IETF currently has the P2PSIP WG, which is developing technology for the use of SIP over P2P networks. In that kind of application, wouldn't the reliability requirements be similar to those in which we "cannot go back to RFC 3489"?
>
> This led me to think about the requirements for the diagnostic scenarios that are also discussed in the document. In existing deployments it is often challenging to figure out why traversal is unsuccessful and what can be done to improve the overall success rate. Data suggests that there are even common situations in which ICE will fail. But in thinking through how to approach diagnosis under those conditions, I would currently be more inclined to start from adding diagnostics to an ICE implementation than to focus on the diagnostic mechanisms described in the draft.
>
> So while I'm generally sympathetic to the idea that there are situations in which "less than perfect" techniques can be useful, in practice a number of common situations where NAT traversal is used today (such as life-critical Internet telephony) do not seem to fit into that bucket.
>
> It could be that I didn't quite understand the examples given in the applicability statement, or that I'm putting too much emphasis on corner conditions, because that is what customers tend to complain about.
>
> However, overall the document left me unclear about the rationale for re-introducing the material deprecated in RFC 3489. While it does seem possible to construct such a rationale, the document doesn't provide enough background to get me over that hump.