Re: Comments on draft-farrell-decade-ni-06

Stephen Farrell <stephen.farrell@xxxxxxxxx> · Tue, 12 Jun 2012 12:11:44 +0100

On 06/12/2012 11:56 AM, "Martin J. Dürst" wrote:
> Hello Stephen,
> 
> On 2012/06/09 22:24, Stephen Farrell wrote:
>>
>> Hi Björn,
>>
>> On 06/08/2012 03:16 AM, Bjoern Hoehrmann wrote:
>>> I think the requirement in RFC 4395 section 2.6 applies here, there are
>>> text fields in 'ni' and 'nih' addresses, so there needs to be some dis-
>>> cussion about I18N and IRI issues, or a statement that there are none,
>>> or something along those lines. What if I want a non-ASCII host name in
>>> them, for instance?
>>
>> So what's reasonable here?
>>
>> We're inheriting some definitions (authority, unreserved &
>> pct-encoded) from 3986 and I'd not like to break that, nor
>> would I like something that doesn't work with most libraries.
>>
>> Any suggestions?
> 
> I shouldn't have missed I18N and IRI issues in my apps review.
> 
> For IDNs in the authority part, you are essentially safe because RFC
> 3986 says exactly how to handle this (%-encoding based on UTF-8;
> alternatively punycode (A-Labels), for which library support may be
> better than for the former). You don't have to say anything about this,
> but you can say something if you think it's useful for implementers and
> if it's limited to a non-normative pointer (to the last paragraph of
> http://tools.ietf.org/html/rfc3986#section-3.2.2).
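
For implementers, here is a minimal Python sketch of the two forms
Martin mentions, using only the standard library. The host name
"bücher.example" is an illustrative assumption, and Python's built-in
"idna" codec follows the older IDNA 2003 rules, so treat this as a
rough sketch rather than a reference implementation:

```python
from urllib.parse import quote

# Hypothetical non-ASCII host name, used only for illustration.
host = "bücher.example"

# Form 1: %-encode the UTF-8 octets of the host name
# (RFC 3986, section 3.2.2, last paragraph).
pct_form = quote(host, safe="")

# Form 2: convert each label to its A-label (punycode) form.
# Python's built-in "idna" codec implements IDNA 2003.
a_label_form = host.encode("idna").decode("ascii")

print(pct_form)
print(a_label_form)
```

Either form leaves only ASCII in the authority part, so it stays
within the 3986 definitions we inherit.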

I'll leave it out unless a bunch of folks think it's better in.

> For the query part, you should put in the following text (or an
> equivalent):
> 
> For compatibility with IRIs, non-ASCII characters in the query part MUST
> be encoded as UTF-8, and the resulting octets MUST be %-encoded (see
> http://tools.ietf.org/html/rfc3986#section-2.1).
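
The rule in that text can be sketched in a few lines of Python. The
helper name encode_query_value and the sample value "café" are
assumptions made for illustration and do not come from the draft:

```python
from urllib.parse import quote, unquote


def encode_query_value(value: str) -> str:
    """Encode a query value as UTF-8, then %-encode the resulting octets.

    With safe="" every character outside the RFC 3986 "unreserved"
    set is %-encoded, including all non-ASCII characters.
    """
    return quote(value, safe="", encoding="utf-8")


# Hypothetical query value containing a non-ASCII character.
encoded = encode_query_value("café")

# Decoding reverses both steps: %-decoding, then UTF-8 decoding.
decoded = unquote(encoded, encoding="utf-8")

print(encoded)
print(decoded)
```

The output is pure ASCII, so the URI form and the IRI form of the
same name map onto each other without ambiguity.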

Added that,
Thanks,
S

> 
> My understanding is that there's no danger for getting non-ASCII
> characters in the path part, and that fragments are separate anyway.
> 
> Regards,   Martin.
> 
>