Re: Gen-ART LC review of draft-ietf-eai-utf8headers-09.txt

Spencer Dawkins wrote:
> Hi, Harald,
>
> Thanks for the quick feedback (Gen-ART reviewers like this because we 
> can remember writing the review, and at least part of what we were 
> thinking about :-)
>
> Looks like mostly goodness. Where we're in sync, I dropped the item from 
> this e-mail.
>
> Spencer
>
>
>>> 1.2.  Relation to other standards
>>>
>>>   This document also updates [RFC2822] and MIME, and the fact that an
>>>   experimental specification updates a standards-track spec means that
>>>   people who participate in the experiment have to consider those
>>>   standards updated.
>>>
>>> Process: The ID Tracker is showing this draft in Last Call status, but I
>>> can't find (in the archive or in my personal folders) any Last Call
>>> announcement, which I was looking for, in order to check how Chris explained
>>> the downref at Last Call time - I'm expecting that it will be quite
>>> entertaining. Has anyone else seen such an announcement on IETF Announce?
>> Note: Intended status is Experimental.
>>
>> The subject line of the Last Call was
>>
>> Last Call: draft-ietf-eai-smtpext (SMTP extension for 
>> internationalized email address) to Experimental RFC
>>
>> and covered 2 drafts; this may be why you did not find it.
>
> Exactly right (I was scanning by subject). While I'm amazed that the 
> downref isn't being called out in the Last Call announcement, I think 
> RFC tracks and standards levels are so arbitrary that they are 
> useless, so I'm not complaining - I was trying to figure out if there 
> really had been a Last Call announcement sent, that's all.
I actually don't see a downref here - this is an Experimental updating a 
Draft Standard (or Full; I don't remember current status well). If 
anything, this is unusual as an upref, not a downref....
>
>>> 4.  Changes on Message Header Fields
>>>
>>>   This protocol does NOT change the definition of header field names.
>>>
>>> technical: I'm confused here. Is this text saying "does not change header
>>> field names"? I would have thought this specification is exactly changing
>>> the definition of header field names...
>> It does not change the definition of header field NAMES (which remain 
>> ASCII), but changes the definition of header field BODIES (which used 
>> to be ASCII, but are now UTF-8).
>>>
>>>   That is, only the bodies of header fields are allowed to have UTF-8
>>>   characters; the rules in [RFC2822] for header field names are not
>>>   changed.
>> And this sentence is saying that. How can we express this more clearly?
>
> Ah. You filled in the missing piece for me here. Perhaps something like
>
> "This protocol does NOT change the [RFC2822] rules for defining header 
> field names. The bodies of header fields are allowed to contain UTF-8 
> characters, but the header field names themselves must contain only 
> ASCII characters."
That seems like a good editorial suggestion to me. Thanks!
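
To illustrate the name/body distinction discussed above, here is a minimal sketch, not taken from the draft. The function names `is_valid_field_name` and `split_header` are hypothetical, and the sketch assumes an unfolded header line with the RFC 2822 field-name character range (printable US-ASCII except the colon):

```python
def is_valid_field_name(name: str) -> bool:
    """RFC 2822 field names: one or more printable US-ASCII characters
    (codes 33-126), excluding the colon."""
    return len(name) > 0 and all(33 <= ord(c) <= 126 and c != ":" for c in name)


def split_header(raw: bytes) -> tuple[str, str]:
    """Split one unfolded header line into (name, body).

    Hypothetical helper: the name must be ASCII, the body may be UTF-8.
    Raises ValueError if either rule is violated.
    """
    name_bytes, sep, body_bytes = raw.partition(b":")
    if not sep:
        raise ValueError("no colon separating field name from body")
    # UnicodeDecodeError is a subclass of ValueError, so a non-ASCII
    # field name is rejected here.
    name = name_bytes.decode("ascii")
    if not is_valid_field_name(name):
        raise ValueError("field name contains characters outside 33-126")
    # The body is where UTF-8 is permitted.
    body = body_bytes.decode("utf-8").strip()
    return name, body


if __name__ == "__main__":
    print(split_header("Subject: Gr\u00fc\u00dfe".encode("utf-8")))
```

A header such as `Subject: Grüße` passes, while a header whose name contains a non-ASCII character is rejected before the body is even examined.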
>
>>>   Interoperability considerations:  The media type provides
>>>      functionality similar to the message/rfc822 content type for email
>>>      messages with international email headers.  When there is a need
>>>      to embed or return such content in another message, there is
>>>      generally an option to use this media type and leave the content
>>>      unchanged or downconvert the content to message/rfc822.  Both of
>>>      these choices will interoperate with the installed base, but with
>>>      different properties.  Systems unaware of international headers
>>>      will typically treat a message/global body part as an unknown
>>>      attachment, while they will understand the structure of a message/
>>>      rfc822.  However, systems which understand message/global will
>>>      provide functionality superior to the result of a down-conversion
>>>      to message/rfc822.  The most interoperable choice depends on the
>>>      deployed software.
>>>
>>> technical: not sure what the last sentence actually means. "We don't know
>>> what the most interoperable choice will be"? Text in the same paragraph says
>>> both choices are interoperable. If that text is correct, I don't understand
>>> what you're saying here.
>> Would it be better to say "the most useful choice"? It's likely to be 
>> the difference between a compliant MUA offering to dump the message 
>> to a file and displaying it as a message...
>
> "The most useful choice" seems very reasonable. The current text seems 
> to contradict other text in the same paragraph.
>
>>> 5.  Security Considerations
>>>
>>>   Because UTF-8 often requires several octets to encode a single
>>>   character, internationalized local parts may cause mail addresses to
>>>   become longer.  As specified in [RFC2822], each line of characters
>>>   MUST be no more than 998 octets, excluding the CRLF.
>>>
>>> clarity: s/CRLF/CRLF, even when UTF-8 characters are being used/
>>>
>>>   Because internationalized local parts may cause email addresses to be
>>>   longer, processes which parse, store, or handle email addresses or
>>>   local parts must take extra care not to overflow buffers, truncate
>>>   addresses, exceed storage allotments, or, when comparing, fail to use
>>>   the entire length.
>>>
>>> technical: this is great advice, but I don't understand how UTF-8 changes
>>> the situation. If you aren't changing the 998-octet requirement, software
>>> that breaks for UTF-8 would also break for ASCII headers with the same octet
>>> length.
>> If someone uses another representation internally (for instance 
>> UTF-16), and has a 998-character buffer, that will sometimes fit into 
>> 998 octets of UTF-8, and sometimes not. The same goes in the other 
>> direction.... I'm sure others will think of other cases.
>
> Thanks for the clear explanation here. This is headed in the right 
> direction - I wasn't impressed with guidance that says "take extra 
> care", but saying "must accommodate 998 characters (which may require 
> more than 998 octets, depending on the character set in use), and must 
> not overflow buffers, ..." seems clear enough to me.
I think it's more like "must accommodate 998 octets, and not send more 
than 998 octets, even though the relationship between this number and 
the number of UTF-8 characters is not a simple one". I see that Klensin 
has picked up on this for 2821, too.
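
The point about octets versus characters can be made concrete with a small sketch. It is only an illustration under stated assumptions: the helper names (`octet_len`, `utf16_units`, `fits_line_limit`) are hypothetical, and the limit is the 998-octet line limit from [RFC2822], excluding the CRLF.

```python
MAX_LINE_OCTETS = 998  # RFC 2822 line limit, excluding the CRLF


def octet_len(line: str) -> int:
    """Octets the line occupies when encoded as UTF-8."""
    return len(line.encode("utf-8"))


def utf16_units(line: str) -> int:
    """16-bit code units the line occupies in a UTF-16 internal buffer."""
    return len(line.encode("utf-16-le")) // 2


def fits_line_limit(line: str) -> bool:
    """True if the UTF-8 form of the line is within the 998-octet limit."""
    return octet_len(line) <= MAX_LINE_OCTETS


if __name__ == "__main__":
    samples = {
        "ascii": "a" * 998,          # 1 octet per character in UTF-8
        "accented": "\u00e9" * 998,  # 2 octets per character in UTF-8
        "cjk": "\u65e5" * 400,       # 3 octets per character in UTF-8
    }
    for label, line in samples.items():
        print(label, len(line), utf16_units(line), octet_len(line),
              fits_line_limit(line))
```

Here a line of 998 accented characters fits a 998-unit UTF-16 buffer but needs 1996 octets as UTF-8, so a check done on character or code-unit counts would pass a line that exceeds the limit on the wire.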

Thanks for the review!

                Harald
_______________________________________________
IETF mailing list
IETF@xxxxxxxx
https://www.ietf.org/mailman/listinfo/ietf