Bruce --

Thanks for the reply.  Your explanation provides some helpful
background.  Would you consider adding some of this material to the
document?

> Date: Sun, 5 Apr 2009 22:57:22 -0400
> Subject: Re: Last Call: draft-ietf-behave-nat-behavior-discovery (NAT Behavior Discovery Using STUN) to Experimental RFC
> From: bbl@xxxxxxxxxxxx
> To: bernard_aboba@xxxxxxxxxxx
> CC: ietf@xxxxxxxx; behave@xxxxxxxx
>
> Bernard,
>
> Thanks for the comments.  Let me see if I can describe a scenario in
> which behavior-discovery is useful.
>
> First, we don't want to "go back to 3489."  There were two problems
> (well, there were a lot more problems, but I just want to talk about
> two right now) in particular that we don't ever want to go back to:
>
> - 3489 specified that an application would start up, characterize
>   its NAT, and work in that mode forever after.
>
> - 3489 specified that if you had a friendly NAT, you could query the
>   STUN server for your transport address and use that one address.
>
> At the same time, behavior-discovery is targeting applications for
> which ICE doesn't necessarily make sense: for example, applications
> that don't want to fall back to TURN but have other options for
> establishing a connection (whether that means indirect routing, not
> needing the connection at all, or other reasons).
>
> So let me try to go into more detail on a potential P2P application.
> When P2P node A starts up, it evaluates its NAT(s) relative to other
> nodes already in the overlay.  Let's say its testing indicates it is
> behind a good NAT, with endpoint-independent mapping and filtering.
> In this case the peer will join the overlay and establish connections
> with appropriate peers, but it will also advertise to any node that
> wants to reach it that there is no need to route through the overlay
> network formed by the P2P nodes (the normal routing mode in a P2P
> overlay); they can just send directly to its IP address.
>
> So when node B wants to send a message to A, it sends the message
> directly to A's IP address and starts a timer.  If it doesn't receive
> a response within a certain amount of time, it routes the message to
> A across the overlay instead.  (Alternatively, B could simultaneously
> send the message to A's IP address and across the overlay, which
> guarantees minimum response latency but can waste bandwidth.)
>
> Over time, A observes what percentage of the time it receives direct
> messages compared to overlay messages.  If the percentage of direct
> connections is below some threshold (say 66%, picking a random
> number), it may stop advertising for direct connections.  But if the
> percentage is high enough, it continues to advertise, because doing
> so may be helping performance.  If at some point the NAT changes its
> behavior, A will notice a change in its direct-connection percentage
> and may re-evaluate its decision to advertise a public address.
>
> (There are a lot of other details about how this might work, how it
> would deal with multiple levels of NATs, and what the actual costs
> and benefits are.  I don't want to get into all of those details
> here.)
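To make the two halves of that scenario concrete, here is a minimal
Python sketch.  Everything in it is an illustrative assumption rather
than anything specified by the draft: the class and function names,
the injected send_direct/send_via_overlay/wait_for_response callables,
the 2-second timer, and the sample-count floor are all made up; only
the 66% threshold echoes the arbitrary figure in the scenario above.

    class PeerSelfState:
        """Node A's side: it advertised its public address after
        behavior discovery, and keeps or drops that advertisement
        based on how peers actually reach it."""

        ADVERTISE_THRESHOLD = 0.66   # "say 66%, picking a random number"
        MIN_SAMPLES = 50             # arbitrary: collect some data first

        def __init__(self, node_id, public_addr):
            self.node_id = node_id
            self.public_addr = public_addr
            self.advertises_direct = True   # initial mode from behavior discovery
            self.direct_msgs = 0
            self.overlay_msgs = 0

        def on_message(self, arrived_directly):
            # Count whether the peer reached us directly or via the overlay.
            if arrived_directly:
                self.direct_msgs += 1
            else:
                self.overlay_msgs += 1
            self._reevaluate()

        def _reevaluate(self):
            total = self.direct_msgs + self.overlay_msgs
            if total < self.MIN_SAMPLES:
                return
            fraction = self.direct_msgs / total
            # If the NAT changes behavior, direct attempts start failing,
            # peers fall back to the overlay, the fraction drops, and the
            # node stops advertising its public address.
            self.advertises_direct = fraction >= self.ADVERTISE_THRESHOLD


    def send_to_peer(peer, msg, send_direct, send_via_overlay,
                     wait_for_response, timeout=2.0):
        """Node B's side: try the advertised address first and start a
        timer; on timeout (or if no address is advertised), route the
        message across the overlay instead.  The three callables stand
        in for whatever transport the application really uses."""
        if peer.advertises_direct:
            send_direct(peer.public_addr, msg)
            if wait_for_response(timeout):
                return "direct"
        send_via_overlay(peer.node_id, msg)
        return "overlay"

The simultaneous-send alternative described above would simply call
both send paths up front and skip the timer, accepting the extra
bandwidth for minimum response latency.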
> This is a good example because behavior-discovery is used for the
> initial operating-mode selection, but the actual decision about
> whether to continue advertising that public IP/port pair is made
> based on actual operating data.  It's also using the result of the
> behavior-discovery work as an optimization, not in a manner where
> the application will fail if a percentage of the nodes in the
> overlay are unable to make a connection.
>
> Bruce
>
>
> On Sat, Apr 4, 2009 at 2:39 AM, Bernard Aboba <bernard_aboba@xxxxxxxxxxx> wrote:
> > Bruce Lowekamp said:
> >
> > "Many of the questions you raise point to the same question of
> > whether tests or techniques that are known to fail on a certain
> > percentage of NATs under a certain percentage of operating
> > conditions are nevertheless valuable.  behavior-discovery has an
> > applicability statement
> > http://tools.ietf.org/html/draft-ietf-behave-nat-behavior-discovery-06#section-1
> > that discusses those issues in some detail.  I spent enough time
> > wording that statement and discussing it with various people that
> > I think it is best to refer to that statement.
> >
> > You also repeatedly use phrases such as "basically won't work" and
> > "it might work."  That comes down to the value of "certain
> > percentage" as used above.  My experience with these techniques,
> > and the experience of those who have used such techniques
> > recently, is that they are far more reliable than that, into the
> > 90% range, particularly when used correctly.  That is not high
> > enough that we could go back to 3489 (all techniques require
> > fallbacks because they fail, and 90% is far, far too low a success
> > rate), but it is high enough that applications can make useful
> > decisions based on that information, provided they have a fallback
> > in cases where the information is wrong.  And those are the
> > conditions of the experiment."
> >
> > What I am failing to understand is the distinction between the
> > situations in which we "cannot go back to RFC 3489" and the
> > scenarios envisaged for the experiment.
> >
> > Presumably, the situations in which we "cannot go back to RFC
> > 3489" include Internet telephony, which may be used for
> > life-critical situations such as E911.  For those kinds of
> > scenarios, we need traversal technologies that are as reliable as
> > possible, and we are willing to live with the complexity of ICE to
> > achieve this.
> >
> > The draft mentions P2P applications as one potential situation in
> > which the use of imperfect techniques is acceptable, and yet the
> > IETF currently has the P2PSIP WG, which is developing technology
> > for the use of SIP over P2P networks.  In that kind of
> > application, wouldn't the reliability requirements be similar to
> > those in which we "cannot go back to RFC 3489"?
> >
> > This led me to think about the requirements for the diagnostic
> > scenarios that are also discussed in the document.  In existing
> > deployments it is often challenging to figure out the reasons why
> > traversal is unsuccessful, and what can be done to improve the
> > overall success rate.  Data suggests that there are even common
> > situations in which ICE will fail.  But in thinking through how to
> > approach diagnosis under those conditions, I'd currently be more
> > inclined to start from the addition of diagnostics to an ICE
> > implementation than to focus on the use of the diagnostic
> > mechanisms described in the draft.
> > So while I'm generally sympathetic to the idea that there are
> > situations in which "less than perfect" techniques can be useful,
> > in practice a number of common situations where NAT traversal is
> > used today (such as life-critical Internet telephony) do not seem
> > to fit into that bucket.
> >
> > It could be that I didn't quite understand the examples given in
> > the applicability statement, or that I'm putting too much emphasis
> > on corner conditions, because that is what customers tend to
> > complain about.
> >
> > However, overall the document left me unclear about the rationale
> > by which the material deprecated in RFC 3489 was being
> > re-introduced.  While it does seem possible to construct a
> > rationale for this, the document doesn't provide enough background
> > to get me over that hump.
_______________________________________________
Ietf mailing list
Ietf@xxxxxxxx
https://www.ietf.org/mailman/listinfo/ietf