Re: Last Call: <draft-faltstrom-5892bis-04.txt> (The Unicode code points and IDNA - Unicode 6.0) to Proposed Standard

Simon Josefsson <simon@xxxxxxxxxxxxx> · Sun, 29 May 2011 20:29:34 +0200

John C Klensin <john-ietf@xxxxxxx> writes:

> --On Sunday, May 29, 2011 08:58 +0200 Simon Josefsson
> <simon@xxxxxxxxxxxxx> wrote:
>
>>> in a Unicode 6.0 environment, evaluate U+19DA as PVALID and
>>> therefore not raise that error, then it is not "compliant"
>>> with RFC 5892, irrespective of the "Updates" status of the
>>> present document.
>> 
>> I don't see how.
>> 
>> My code uses the tables from RFC 5892 which were generated in
>> a Unicode 5.2 environment.  My IDNA2008 code may eventually
>> run in a Unicode 6.0 environment, or any other future version
>> of Unicode.  I can't control the Unicode version used, and
>> from what I understand this is one of the features of
>> IDNA2008.  Implementations need not lock down the Unicode
>> version to a single Unicode version, as they had to do for
>> IDNA2003.
>
> It seems to me that this is exactly where we are having a
> misunderstanding.   In terms of determining conformance, those
> tables are not normative, so it is not possible to say "I
> implemented the tables in RFC 5892 and therefore I conform to
> the standard".  The closest you can get would be to say "I
> implemented the rules and tested against the tables when those
> rules were applied to Unicode 5.2 and therefore have great
> confidence in my implementation", but conformance statements stop
> with "implemented the rules correctly".  
>
> For practical reasons, we expect to see production
> implementations using tables or other abstractions of the rules
> that are somewhat pre-compiled, not applying the rule set each
> time.   One consequence of this is that a given table-based
> implementation is inevitably dependent on versions of Unicode
> even if the Standard (and its conformance requirements) is not.

Right, and that describes my implementation.  There is no difference in
behaviour between an implementation that uses the informative tables in
RFC 5892 directly and one that pre-computes the table at compile time
using Unicode 5.2.  The data and output are the same in both cases.  So
I don't follow where you think the misunderstanding is.  I agree with what
you say here.
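
To make the two approaches concrete, here is a minimal sketch in Python.
It is an illustration under stated assumptions, not my actual
implementation: FROZEN_TABLE is a hypothetical one-entry excerpt, the
function names are made up, and letter_digits_rule applies only the
LetterDigits category from RFC 5892 Section 2.1 while skipping every
other rule.  The table answer is fixed at generation time, while the
rule answer follows whatever Unicode data the runtime ships.

```python
import unicodedata

# Hypothetical one-entry excerpt of a table generated from Unicode 5.2.
# Per this thread, U+19DA was PVALID when the rules were applied to 5.2.
TABLE_UNICODE_VERSION = "5.2.0"
FROZEN_TABLE = {0x19DA: "PVALID"}

# General categories of the LetterDigits rule in RFC 5892, Section 2.1.
LETTER_DIGITS = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}


def frozen_lookup(cp):
    """Table-based answer; independent of the Unicode data present at run time."""
    return FROZEN_TABLE.get(cp)


def letter_digits_rule(cp):
    """Simplified rule-based answer using only the LetterDigits category.

    The real algorithm checks several other categories first (exceptions,
    unassigned, join controls, and so on); they are deliberately omitted.
    """
    category = unicodedata.category(chr(cp))
    return "PVALID" if category in LETTER_DIGITS else "DISALLOWED"


if __name__ == "__main__":
    cp = 0x19DA
    print("table from Unicode", TABLE_UNICODE_VERSION, "->", frozen_lookup(cp))
    print("rule on Unicode", unicodedata.unidata_version, "->", letter_digits_rule(cp))
```

When the table was generated from the same Unicode version the runtime
uses, both functions agree for the code points the simplified rule
covers; they can only diverge when the versions differ.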

>> If this model is not permitted, I believe there are bigger
>> problems.
>> 
>> To avoid doubt, and to back up your assertion that my
>> implementation is non-compliant, please point to the "MUST" or
>> "SHOULD" in RFC 5892 that forbids this, to me, logical
>> implementation approach.
>
> The key is the text in Section 4 that says:
>
> 	"The table in Appendix B shows, for illustrative
> 	purposes, the consequences of the categories and
> 	classification rules, and the resulting property values.
> 	
> 	"The list of code points that can be found in Appendix B
> 	is non-normative.  Sections 2 and 3 are normative."
>
> It seems to me that is very clear about the relationship between
> the rules and the tables.   That relationship is reiterated in
> Section 7.1.1 of RFC 5892.

s/5892/5894/

Sure.  But that does not prove (or disprove) Pete's claim that my
implementation is non-compliant.

> You could reasonably say that your implementation is conformant
> but current only to Unicode 5.2.   If you are willing to say
> that, I guess you don't need to change anything.

I claim my implementation is compliant with all requirements in RFC 5890,
RFC 5891, RFC 5892 and RFC 5893.

> While we recognize that you have no control over the Unicode version
> in use, good sense suggests that systems will update versions of
> Unicode (including all of the associated tables and support routines
> as applicable) and versions of your library together,

That is unrealistic.  Traditional operating systems are already so
complex that upgrading them to a single Unicode version across all
software pieces (Java, Perl, SQL databases, web browsers, word
processors, etc.) is economically infeasible.

Modern operating systems rely so much on network services that it is not
even useful to decouple the local system from external systems.
Essentially "the system" is identical to "the Internet".  A flag day to
upgrade to the latest Unicode version across the Internet is, despite
how infinitely pleasant that would be, impossible.

If it were possible to upgrade software components to the latest Unicode
version in a controlled way, the IDNA2003 model would have worked fine.

Fortunately, I believe IDNA2008 does not require tight Unicode version
synchronization.  In fact, I believe one of the features of IDNA2008
is exactly that it doesn't require all Unicode versions to be in sync in
all parts of the Internet.

> While that should be clear from the context of the discussions in RFC
> 5891 and 5892, RFC 5894 is quite explicit about it in the second
> bullet of Section 7.1.2:
>
>  "o The Unicode tables (i.e., tables of code points,
> 	character classes, and properties) and IDNA tables
> 	(i.e., tables of contextual rules such as those
> 	that appear in the Tables document), must be
> 	consistent on the systems performing or validating
> 	labels to be registered.  Note that this does not
> 	require that tables reflect the latest version of
> 	Unicode, only that all tables used on a given
> 	system are consistent with each other."

That is about registration of labels, not lookup.  Registration is a
centralized process where you can control the software used more easily.

> Similarly, the first bullet of 7.1.3 reads:

You forgot to quote the paragraph before the one you quoted:

   Any application processing a label through IDNA so it can be looked
   up in a DNS zone is required to (the exact rules appear in Section 5
   of the Protocol document [RFC5891]):

>  "o Maintain IDNA and Unicode tables that are consistent
> 	with regard to versions, i.e., unless the application
> 	actually executes the classification rules in the Tables
> 	document [RFC5892], its IDNA tables must be derived from
> 	the version of Unicode that is supported more generally on
> 	the system.  As with registration, the tables need not
> 	reflect the latest version of Unicode, but they must be
> 	consistent."

I don't see any similar text about IDNA and Unicode version consistency
requirements in section 5 of RFC 5891.  I'm sure you recall that RFC
5894 is as non-normative as the RFC 5892 tables are.
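
Whatever its normative status, the consistency condition in the quoted
bullet is easy to state in code.  A minimal sketch in Python follows;
tables_consistent and major_minor are hypothetical names, and comparing
at major.minor granularity is my assumption, not something the RFCs
spell out.

```python
import unicodedata


def major_minor(version):
    """Reduce a version string such as '5.2.0' to the tuple (5, 2)."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def tables_consistent(idna_table_version, system_version=unicodedata.unidata_version):
    """True when the IDNA table and the system's Unicode data share major.minor."""
    return major_minor(idna_table_version) == major_minor(system_version)
```

A table generated from Unicode 5.2 passes this check only on a system
whose general Unicode support is also at 5.2, which is exactly the
coupling I argue a lookup implementation cannot guarantee.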

/Simon