John,
I can't help but feel that there's something of a disconnect between what the
document is actually about and the issues that are being discussed here.
The document is not about policy. It is also not about domain names
exclusively. The same is true for RFC7940 by the way.
RFC7940 is about how you write down a specification for label generation
rules. It is not a specification for designing them. It is also not a
specification for what should be possible in label generation rules for
IDNs in the DNS.
In other words, it is a protocol for capturing a wide range of possible
policies for a range of possible identifier systems, of which domain names
are the most prominent (and first) example.
Variant mechanisms exist today in at least two different RFCs, one for
Arabic and one for Chinese. RFC 7940 provides the tools to express them
rigorously, without passing any judgement on the feasibility or desirability
of defining a policy that allows variants.
The XML formalism is hard to follow for many people and obscures to them
what is happening underneath. Because it is broad enough to cover all
pre-existing schemes, it is also possible to create policies with edge-cases
that can get you in trouble, where a label generation ruleset may not be
"well-behaved" in an implementation sense. The current draft is aimed at
explaining these technical aspects, again without addressing the question
whether variants are desirable for the DNS.
There's nothing here that sets policy for the DNS. If that were desirable, it
should be done in a different document. I concur that such a document
would be the correct place to address questions around what
is implementable in terms of *delegated* variants in the DNS, and even
that such a document would seem desirable. I fully agree with what is
described as the "magical thinking" around variants. But none of these
issues is addressed in this draft, and this draft is not the place to
discuss them.
RFC7940 is an XML schema. The current draft doesn't change anything
about this schema; therefore, it is not, in my view, material for an
"update" to RFC7940.
If you think of RFC7940 as a programming language for LGRs, the current
document is a style guide. These are two entirely different things. Whether
using standards-track for a style guide is appropriate, I have no opinion
on, but I am clear that this draft neither updates nor supersedes any part
of RFC 7940.
The JET documents contain a combination of information: how to capture
a variant relationship in a multi-column plain text format on the one hand,
and what to base the relationship on, on the other hand.
RFC7940 only (!) addresses the first point. It does not address what variant
relations are desirable (or should be allowed) in any zone in the DNS.
The current document goes a step further and examines what kind of
assumptions about the nature of supportable variants go into the design
of the LGR formalism. Contrary to what is claimed, the draft is intended
to make clear that "variants" that are not 1:1 substitutions are effectively
not tractable with this mechanism.
If you have a relationship between two labels that can be characterized
by a distance in perceptual space, and where you define some arbitrary
but non-zero distance as defining confusability between such labels, then
of three labels, two may be confusable with the third, but not confusable
with each other: the relationship is not transitive.
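To make this concrete, here is a minimal sketch. It assumes, purely for illustration, that each label has a position on a one-dimensional "perceptual" axis; the label names, positions and threshold are all invented and are not taken from any real confusability data.

```python
# Hypothetical positions of three labels on a 1-D "perceptual" axis.
# All numbers here are invented for illustration only.
POSITION = {"label-a": 0.0, "label-b": 1.0, "label-c": 2.0}
THRESHOLD = 1.5  # arbitrary, non-zero distance that defines confusability


def confusable(x: str, y: str) -> bool:
    """Two labels count as confusable if they are closer than THRESHOLD."""
    return abs(POSITION[x] - POSITION[y]) < THRESHOLD


# a~b and b~c both hold, but a~c does not: the relation is not transitive.
print(confusable("label-a", "label-b"))  # True
print(confusable("label-b", "label-c"))  # True
print(confusable("label-a", "label-c"))  # False
```

Moving THRESHOLD changes which pairs collide, which is exactly the arbitrariness that a distance-based definition brings with it.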
Non-transitive relationships are not handled well with RFC7940, because
it does not have a way of allocating distances (or locations) in perceptual
space. (Language to that effect is definitely in the latest draft I wrote,
but I don't know whether it's already in the -02.)
If your relationships between blocked variants are symmetric and transitive,
collision checking becomes an O(1) operation, somewhat like a hash.
This makes blocked variants attractive for cases where, in case of a
collision, there is no doubt that the labels are clearly colliding.
(Collisions based on perceptual distance suffer from the arbitrary selection
of the minimal required distance and are always open to pressures to
override or make case-by-case exceptions, see .br.)
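A rough sketch of why the symmetric and transitive case behaves like a hash lookup, assuming simple 1:1 code point variants. The variant classes, the `index_label` helper and the `Registry` class are hypothetical illustrations, not anything defined by RFC7940 or the draft.

```python
# Hypothetical variant classes: each code point maps to one representative.
# Because the relation is symmetric and transitive, the classes are disjoint,
# so every label has exactly one index form. The classes are invented.
REPRESENTATIVE = {"a": "a", "\u00e0": "a", "\u00e1": "a", "o": "o", "0": "o"}


def index_label(label: str) -> str:
    """Replace each code point by the representative of its variant class."""
    return "".join(REPRESENTATIVE.get(cp, cp) for cp in label)


class Registry:
    """Registered labels, keyed by index label for constant-time lookup."""

    def __init__(self):
        self._by_index = {}

    def register(self, label: str) -> bool:
        """Register a label; return False if a variant of it is taken."""
        key = index_label(label)
        if key in self._by_index:  # one dictionary lookup, no pairwise scan
            return False
        self._by_index[key] = label
        return True


registry = Registry()
print(registry.register("pao"))       # True
print(registry.register("p\u00e10"))  # False: blocked as a variant of "pao"
```

The check costs one lookup regardless of how many labels are registered. Computing the index label is still linear in the length of the label.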
One could capture the confusables data in UTS#46 in the form of RFC7940
and mechanically make them transitive and symmetric (as they are not
specified that way, although they appear to at least be intended to be
implicitly symmetric). If that is done, one finds that the transitivity
requirement would lead to mapping a number of clearly distinct labels
to each other; for others, the variant relation could be transitive as well.
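The mechanical completion can be sketched with a small union-find. The pairs below are invented for illustration and are not taken from any Unicode data file; `close_variants` is a hypothetical helper name.

```python
def close_variants(pairs):
    """Return the symmetric, transitive closure as a list of variant sets."""
    parent = {}

    def find(x):
        # Follow parent links to the class representative (with path halving).
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra  # merge the two classes

    classes = {}
    for x in list(parent):
        classes.setdefault(find(x), set()).add(x)
    return list(classes.values())


# Invented input: each pair may look plausible on its own, but the closure
# chains them, so "1" and "|" land in the same class although no pair
# relates them directly.
pairs = [("1", "l"), ("l", "I"), ("I", "|"), ("x", "y")]
for variant_set in close_variants(pairs):
    print(sorted(variant_set))
```

This chaining is the effect described above: enforcing transitivity on data that was not designed for it can merge items that are clearly distinct.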
RFC7940 allows symmetric-only LGRs, and with suitable tools these
could be used to implement blocking, but the optimizations available
for the transitive case would not hold. (That is something that could
be described in a section in the draft, if it were considered helpful;
however, this author has no information on best strategies for implementing
blocking in a symmetric-only case.)
The Arabic RFC defined positional variants (a necessary feature for that
script) but the context rules for variants that this requires can lead to
undesirable edge cases when one violates the underlying assumption
of symmetry: if I substitute a variant code point in a label it must satisfy
the same context rule, otherwise the mapping is not symmetric.
(The types of context rules one tends to use in Arabic happen to be
well-behaved.)
RFC7940 does not require variant relations to be symmetric and transitive.
It is not unreasonable to want a tool that mechanically completes
the specification of mappings, and to want its input file to be formally
valid under the XML schema, even if in practice one would not want to
use an LGR that isn't symmetric and transitive.
The work required to prove that an XML LGR is symmetric and transitive,
as far as the mappings are concerned, is practically the same as enforcing
that constraint, by the way.
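A small sketch of why checking and enforcing are nearly the same work, assuming the mappings have already been read into a plain dictionary. The function names and the dictionary layout are hypothetical; nothing here parses actual RFC7940 XML.

```python
def missing_mappings(variants):
    """Return the (source, target) pairs needed for symmetry and transitivity.

    `variants` maps each code point to the set of its variant code points.
    An empty result means the mappings are symmetric and transitive.
    """
    missing = set()
    for a, targets in variants.items():
        for b in targets:
            if a not in variants.get(b, set()):
                missing.add((b, a))      # symmetry needs b -> a
            for c in variants.get(b, set()):
                if c != a and c not in targets:
                    missing.add((a, c))  # transitivity needs a -> c
    return missing


def complete(variants):
    """Enforce the constraint by adding missing pairs until none remain."""
    result = {cp: set(targets) for cp, targets in variants.items()}
    while True:
        todo = missing_mappings(result)
        if not todo:
            return result
        for source, target in todo:
            result.setdefault(source, set()).add(target)


incomplete = {"a": {"b"}, "b": {"c"}}  # invented example input
print(sorted(missing_mappings(incomplete)))
print(missing_mappings(complete(incomplete)) == set())  # True
```

The checker already computes every pair that would have to be added, so turning it into an enforcer is only the extra loop in `complete`.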
The notation used in this draft is simply a shorthand for the formalisms
available in RFC7940, so that it is possible to succinctly write down
examples that aren't overburdened with XML syntax. Really nothing more
and nothing less. If RFC7940 were extended in the future, so could this
symbolic shorthand notation.
I believe it is factually incorrect to say that either RFC7940 or this
draft is in any way constrained to what "ICANN has decided to allow for
the Root". The expert group hired by ICANN has followed some of the
reasoning found in this draft to recommend against some details in certain
proposed definitions of variants, because they were realized to lead to
ambiguous edge cases.
The suggestion that the section showing the correspondence is too cursory
I take as constructive. There's no requirement that prevents it from being
more comprehensive.
A./
On 2/14/2017 1:59 AM, John C Klensin wrote:
--On Tuesday, January 17, 2017 09:23 -0800 The IESG
<iesg-secretary@xxxxxxxx> wrote:
The IESG has received a request from an individual submitter
to consider the following document:
- 'Variant Rules'
<draft-freytag-lager-variant-rules-02.txt> as Informational
RFC
The IESG plans to make a decision in the next few weeks, and
solicits final comments on this action. Please send
substantive comments to the ietf@xxxxxxxx mailing lists by
2017-02-14. Exceptionally, comments may be sent to
iesg@xxxxxxxx instead. In either case, please retain the
beginning of the Subject line to allow automated sorting.
Summary: This document should not be published in the IETF
Stream, at least in its present form, proposed status, and
relationship to other documents. An explanation and some
alternatives appear below.
Details:
This is a difficult document for me to review for multiple
reasons, including a conviction that the intentions are the
very best but that the document is artificially constrained by
essentially political decisions taken outside the IETF or any
other process that would meet traditional IETF criteria for
openness, transparency, fairness, and rough consensus.
For IETF decision-making, a largely procedural issue is as, or
perhaps more, important. The document bears a relationship to
the standards-track RFC 7940 that is confusing at best. The
"Document Quality" section of the proposed approval notice
starts "The document largely reflects experience gathered from
implementing RFC 7940 and creating rulesets based on it". That
is a worthy goal and entirely consistent with the "rough
consensus and running code" principle. The author has done
what I believe is a laudable job of coming up with a
semi-mathematical and testable alternative to the rather
lengthy, complex, and less easily validated and tested, XML of
RFC 7940.
If the document were submitted to the ISE as a "I think this
would be a better way to do things while meeting the same goals
as RFC 7940" or even as "this is what ICANN is doing (or
proposes to do) and the community should know about it" piece,
I'd have little or no objection to it. However, as an IETF
stream document that apparently is intended to replace (or at
least provide an alternative to) large sections of 7940 with a
different strategy, either
-- it should be a standards-track document that explicitly
updates and replaces those portions of 7940, with all of
the documentation and explanation the IESG requires of such
updates. OR
-- it should be a standards-track document that provides an
alternative to 7940 and that explains the choices and
tradeoffs.
As an IETF Stream Informational specification, it is an
apparent IETF Informational document that encourages the
practice of something other than an IETF Proposed Standard that
addresses the same topics and requirements, with no
Applicability Statement or other guidance as to when or if it
should be applied.
I do not believe it should be published on that basis.
Without descending into details and nitpicking, there are also a
pair of technical problems.
(1) As RFC 7940 points out, the concept of "variant" and hence
the relationships needed to express the "increased
requirements of contemporary IDN variant policies" [RFC7940,
Section 9] has moved considerably beyond the definition of that
term and concept in RFC 3743 (aka "the JET specification").
This document further refines the description of those
relationships. However, if one is going to move beyond the JET
concept -- one tailored to the relationship between Simplified
and Traditional Chinese characters and not about, e.g., visual
confusion at all -- it is not clear that there is a technical
basis for saying "these things are variants and those others
are not". The "others" can include synonyms, translations,
orthographic variations that cannot be expressed in simple
character (or even character sequence) mappings, and so on.
ICANN made a series of decisions (IMO, some of them almost by
accident and others by side-effect or more political reasons)
as to what kinds of relationships might be considered variant
candidates, at least for the root zone. The grammar proposed
in this document (and the one of 7940) exclude those other,
non-ICANN-sanctioned, relationships and cannot, in general,
represent them in spite of the fact that they might be quite
appropriate (at least for blocking) in non-root zones and have
been used in exactly that way (indeed, two of the key areas of
friction between IDNA2008 and Unicode UTS #46 can be seen in
exactly those terms). To a certain extent, that is a
criticism of 7940 rather than this document, but there is an
important difference, at least IMO: The grammar of 7940 is
essentially descriptive and, like most good XML structures,
could easily be expanded with additional elements or element
components if the need arose. It is far less clear how one
would expand a quasi-mathematical grammar, especially one that
is heavily dependent on special operator symbols and strict
typologies, like this one. Even if one were to figure out how
to expand this as requirements evolve outside ICANN's control,
such extensions would raise questions of how to keep this
document and 7940 synchronized. That issue might be another
reason for standards track status and either a more explicit
discussion of relationships and mapping; one or more IANA
registries of operators, types, and elements; or both.
(2) Independent of the web and the convenience of HTTP redirect
facilities, it is not at all clear how to implement delegated
variants in a way that is not damaging to the Internet and the
DNS. The opportunities for combinatorial explosion and
consequent operational and zone management problems in all but
a few very special cases (including, historically, the one that
at least some of the JET designers had in mind) are
considerable. The ICANN solution of "just delegate them all to
the same party and make it their problem" may be satisfactory
from their corporate point of view, but is not a way to make
the Internet work better, especially in the context of
potentially hundreds of names that have to be kept synchronized
in a way that leads to consistent behavior across all
protocols. RFC 7940 exposes some of this problem as well, but,
again, is rather more descriptive than this document, which
moves much closer to a set of executable tests that establish
what is valid (and presumably reasonable) and what is not.
I, and a few others, have suggested in other contexts that
"variant" has become part of a magical ritual in which one looks
at a complex DNS-related problem, solemnly chants the word a
propitious number of times, and the problem is then assumed to
disappear or be solved. The issues above are only a few of the
cases to which variations on that ritual have been applied.
Unless the IETF has better solutions than magic, it should not
be legitimizing the magical thinking by publishing documents
that appear to encourage delegation of "variants".
Two additional nits, the first an important procedural one and
both to provide illustrative examples that this document would
need work even if none of the considerations above applied.
(i) At least since RFC 3552, we have not allowed documents in
the IETF stream that say "There are no security considerations
for this memo.". And yet that is exactly what Section 18 has
to say.
(ii) Section 16 ("Corresponding XML Notation") is a good idea,
but, rather than providing a comprehensive mapping, it
essentially says "here are some examples of the mapping between
notations; everything else is left as an exercise for the
reader". I think that is confusing and unfortunate but, if it
really is what the IETF and the author want to do, it should be
made much more explicit.
thanks,
John Klensin