Re: Last Call: <draft-ietf-dane-openpgpkey-07.txt>

E Taylor <hagfish@xxxxxxxxxxxx> · Mon, 15 Feb 2016 08:36:36 +0000

Hello,

Thank you, John, for your detailed comments on the i18n aspect of this
draft, which I admit I hadn't fully considered.  I think you're right
that, whatever approach is taken, it would make sense to add a short
"Internationalization Considerations" section to state what the expected
interaction is between this specification and non-ASCII addresses.

More comments inline below:

> Temporarily and for purposes of discussion, assume I agree with
> the above as far as it goes (see below).   Given that, what do
> you, and the systems you have tested, propose to do about
> addresses that contain non-ASCII characters in the local-part
> (explicitly allowed by the present spec)?  Note that lowercasing
> [1] and case folding are different and produce different results
> and that both are language-sensitive in a number of cases, what
> specifically do you think the spec should recommend?  

I have not seen any specific examples of software which unintentionally
converts characters to uppercase (although I can readily imagine such
bugs/features), so I'm prepared to assume that the lowercasing logic can
be safely limited to just the input strings which include only ASCII
characters.  My idea was for the client to make a reasonable effort to
correct for a plausible (but rare) problem, so for the purposes of an
experiment I think it is acceptable if this correction does not try
anything more clever, like converting MUSTAFA.AKINCI@xxxxxxxxxxx to
mustafa.akıncı@example.com (although mustafa.akinci@xxxxxxxxxxx should
be tried).
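
To make the idea concrete, here is a minimal sketch of the lookup order
I have in mind.  The function name and the "exact form first, then
lower-cased form" ordering are my own assumptions for illustration, not
text from the draft:

```python
def candidate_local_parts(local_part):
    """Return the local-parts a client might try, in order.

    Hypothetical helper: the name and the ordering are assumptions
    made for this sketch, not something the draft specifies.
    """
    candidates = [local_part]  # always try the address exactly as given
    # Only attempt the correction when every character is ASCII, so that
    # no language-sensitive case mapping is ever needed.
    if local_part.isascii():
        lowered = local_part.lower()
        if lowered != local_part:
            candidates.append(lowered)
    return candidates


print(candidate_local_parts("MUSTAFA.AKINCI"))
print(candidate_local_parts("F\u00D6O"))  # non-ASCII: left untouched
```

With this restriction a client performs at most one extra lookup per
address, and none at all for non-ASCII local-parts.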

> Also, do you think it is acceptable to publish this document
> with _any_ suggestions about lower-casing or "try this, then try
> something else" search without at least an "Internationalization
> Considerations" section that would discuss the issues [1] and/or
> some more specific recommendation than "try lowercase" (more on
> that, with a different problem case, below).

You are right that adding such a section could be of great benefit to at
least some implementers, even if the discussion in that section is
simply "Only try lower-casing when the input is all ASCII".  If someone
can come up with something more helpful than that brief statement, then
I'd be very supportive of it.

> Dropping that assumption of agreement for discussion, I
> personally believe that this document could be acceptable _as an
> Experimental spec_ with any of the following three models, but
> not without any of them:
>
>  (i) The present "MUST not try to guess" text.
>
>  (ii) A recommendation about lowercasing along the lines
> 	you have outlined but with a clear discussion of i18n
> 	issues and how to handle them [2].
>
>  (iii) A clear statement that the experiment is just an
> 	experiment and that, for the purposes of the experiment,
> 	addresses that contain non-ASCII characters in the local
> 	part are not acceptable (note that would also require
> 	pulling the UTF-8 discussion out of Section 3 and
> 	dropping the references to RFC 6530 and friends).

Perhaps you would settle for an option (ii.v): my lowercasing
recommendation, plus a discussion of the i18n issues, with that
discussion based on the experimental restriction of applying the
lowercasing logic only to ASCII-only local parts.  I hope that would be
in keeping with your sensible suggestions above.

> ...
> e.g., 
>    U+0066 U+006F U+0308 U+006F   and
>    U+0066 U+00F6 U+006F
> are perfectly good (and SMTPUTF8-valid) representations of the
> string "föo"    
>
> Using the same theory as your lower case approach, would you
> recommend trying first one of those and then the other [3]?

That is tempting, but I accept that it may add more complexity than is
warranted at this stage of the experiment.
I know that various ideas have been proposed for handling normalisation
of local-parts more generally, and I think we should allow that work to
progress separately, uncoupling it from the document at hand.
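
For anyone who wants to see the two representations side by side, the
following sketch uses Python's standard unicodedata module.  The
variable and function names are mine, chosen only for illustration:

```python
import unicodedata

decomposed = "\u0066\u006F\u0308\u006F"   # f, o, combining diaeresis, o
precomposed = "\u0066\u00F6\u006F"        # f, o-with-diaeresis, o


def utf8_hex(text):
    """Hex dump of the UTF-8 encoding (illustrative helper)."""
    return text.encode("utf-8").hex()


# The strings render identically but are different code point sequences,
# and therefore different UTF-8 byte strings.
print(decomposed == precomposed)
print(utf8_hex(decomposed), utf8_hex(precomposed))

# Unicode normalisation maps each form onto the other.
print(unicodedata.normalize("NFC", decomposed) == precomposed)
print(unicodedata.normalize("NFD", precomposed) == decomposed)
```

Any lookup keyed on the raw bytes of the local-part will treat these as
two different names unless one side normalises first.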

> The more I think about it, the more I'm convinced that the
> specification and allowance for UTF-8 [4] in the first bullet of
> Section 3 is unacceptable without either text there that much
> more carefully describes (and specifies what to do about) these
> cases or an "Internationalization Considerations" section that
> provides the same information.  I suggest that anyone
> contemplating writing such text carefully study (not just
> reference) Section 10.1 of RFC 6530.   Of course, simply
> excluding non-ASCII local-parts from the experiment, as
> suggested in (iii) above, would be an alternative.  I have mixed
> feelings about whether it would be an acceptable one for an
> experiment.  I am quite sure it would not be acceptable for a
> standards-track document when the EAI work and/or the IETF
> commitment to diversity are considered.

I think that excluding non-ASCII local-parts from just the extra
lower-casing logic, and pointing out the complexity of case handling in
non-ASCII contexts in a separate section as you have suggested, might
address the outstanding concerns, without hindering diversity.
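
As a small illustration of the kind of complexity such a section might
mention, here is a sketch using Python's built-in, locale-independent
string methods.  The helper name is hypothetical, and other languages
and libraries may map these characters differently:

```python
def case_variants(text):
    """Return the lower-cased and case-folded forms (illustrative helper)."""
    return {"lower": text.lower(), "casefold": text.casefold()}


# Lower-casing and case folding disagree for U+00DF (sharp s).
print(case_variants("\u00DF"))

# U+0131 (dotless i) upper-cases to plain ASCII "I", so the round trip
# through upper case cannot recover it without knowing the language.
dotless_i = "\u0131"
print(dotless_i.upper() == "I")
print(dotless_i.upper().lower() == dotless_i)
```

This is why an all-upper-case local-part such as MUSTAFA.AKINCI cannot
be mapped back to its dotless-i spelling by a generic lower-casing step.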

> ...
> [2] I note that, historically, the DNS community has been very
> reluctant to accept techniques that depend on or imply multiple
> lookups for a single perceived object and, separately, for
> "guess at this, try it, and, if that does not work, guess at
> something else" approaches.  Unless those concerns have
> disappeared, the potential for combinatorial explosion when
> lower-casing characters that may lie outside the ASCII
> repertoire is truly impressive.

That's another reasonable point, thank you.  Hopefully it is mitigated,
at least for the most part, by lower-casing only all-ASCII local-parts,
which avoids the combinatorial explosion you mention.  Also, this extra
lower-casing step will only happen in the relatively rare situations
where the input local-part contains at least one upper-case character
(although I don't know how many extra lookups that will lead to in
practice, on average).
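
To give a rough feel for the explosion being avoided, the sketch below
counts the candidates a client would face if it tried every plausible
lower-case form.  The ambiguity table is hypothetical and deliberately
tiny (a single entry, for the dotted and dotless i); a real one would
be larger and language-dependent:

```python
from itertools import product

# Hypothetical table, for illustration only: upper-case characters that
# have more than one plausible lower-case form.
AMBIGUOUS_LOWERCASE = {"I": ("i", "\u0131")}


def lowercase_candidates(local_part):
    """Enumerate every lower-case candidate under the table above."""
    options = [
        AMBIGUOUS_LOWERCASE.get(ch, (ch.lower(),)) for ch in local_part
    ]
    return ["".join(combo) for combo in product(*options)]


# Two ambiguous characters already give 2 ** 2 = 4 lookups.
print(len(lowercase_candidates("MUSTAFA.AKINCI")))
# Eight ambiguous characters give 2 ** 8 = 256.
print(len(lowercase_candidates("I" * 8)))
```

The count doubles with every ambiguous character, whereas the ASCII-only
rule never needs more than one extra candidate.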

Best regards,
Edwin