On Sun December 5 2004 13:36, McDonald Ira wrote:
> Hi,
>
> Relative to Bruce's suggestion that the 40 character restriction
> in names applies only to MIBs:
>
> (1) MIBs in both SMIv1 and SMIv2 have always supported the ASN.1
> standard maximum of 63 characters for identifiers
>
> (2) But, due to underlying linker restrictions, _many_ MIB compilers
> truncate identifiers at 31 characters (or arbitrarily rewrite
> them after about 25 characters)
>
> (3) So 40 characters isn't a helpful restriction for MIB names.

[I'm copying the ietf-822 list as the issue(s) discussed affect MIME and the Internet Message Format; responses to the charset-specific part should remain on ietf-charsets. I'm also copying the ietf and ietf-languages lists where a related discussion about language tags is taking place.]

To date, I have merely pointed out that the registration for MIME names imposes no upper bound, but that the MIB requirements do indicate a limit for the cs* aliases. I have not stated whether I thought that there should be an explicit limit in general. It is now time to speak up on that matter.

I am prompted to do so by considerations arising from a proposal to replace RFC 3066, which defines language tags and their registration procedure. Charset names and language tags are connected by way of RFC 2231, which amended RFC 2047's definition of "encoded-word" to include provision for a language tag. An encoded-word has the form (my representation, not the official one; for the latter consult RFC 2231 and errata):

    =?<charset>*<language-tag>?<encoding>?<text>?=

The text part must be at least 4 octets in order to accommodate B encoding restrictions. Encodings are currently represented by a single octet, and as encodings are intended to be limited in number, let's assume that a single octet will suffice indefinitely. RFC 2047 limits an encoded-word to 75 characters, and the delimiters ("=?", "*", "?", "?" and "?=") take 7 of those. That leaves a maximum of 63 octets for the total length of the charset name and the language-tag.
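The 63-octet figure can be reproduced with a little arithmetic. The following is only a sketch: the 75-character cap is taken from RFC 2047 rather than from the message above, and the function and constant names are made up for illustration.

```python
# Sketch of the encoded-word octet budget described above.
# Assumption: RFC 2047 caps an encoded-word at 75 characters.
MAX_ENCODED_WORD = 75

def charset_language_budget(encoding_len=1, min_text_len=4):
    """Octets left for the charset name plus the language tag.

    encoding_len -- length of the encoding token (currently 1: "B" or "Q")
    min_text_len -- shortest permitted encoded text (4 for B encoding)
    """
    # Fixed delimiters in =?<charset>*<language-tag>?<encoding>?<text>?=
    delimiters = len("=?") + len("*") + len("?") + len("?") + len("?=")
    return MAX_ENCODED_WORD - delimiters - encoding_len - min_text_len

print(charset_language_budget())
```

Raising either parameter shows how quickly the room shrinks if a future encoding needs a longer tag or a longer minimum text part.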
RFC 2978 (charset name registration) provides a procedure for review, so while a charset name could theoretically be unbounded in length, the review process is expected to catch cases which would prove problematic for encoded-words -- in fact, so far as I can determine, the longest charset name suitable for use in an encoded-word (i.e. charsets suitable for text/plain, considering the preferred MIME name where specified, otherwise the primary name) has a length of 45 octets. RFC 2231 also provides for charset specification in extended parameters used with Content-Type and Content-Disposition fields; these are not required to be charsets suitable for text/plain, and the combined length available for charset and language tag is much greater than in an encoded-word (but still finite).

Under RFC 3066, there is a similar registration and review procedure, and while again there is the theoretical possibility of a very long language tag, the longest such registered tag has a length of 11 octets. Combined, the longest charset and longest language tag total 56 octets, which is less than the 63 octet limit imposed by encoded-word syntax.

Unregistered, private-use charsets and/or language tags could of course be longer; that does not concern me. Private-use requires coordination between communicating parties, and it is a matter for those parties to agree on private-use tags that fit within the relevant limits.

There is a draft proposal for a replacement of RFC 3066 which would decouple non-private-use language tag use from the review/registration procedure and which would provide for non-private-use language tags of unbounded length. That not only represents a problem for encoded-word use, but it is also a problem for Internet Message Format header (message- and MIME-part) fields which use language tags, such as RFC 3282's Content-Language and Accept-Language.
A "New Last Call" has been issued for the draft proposal on the ietf-announce list:

    http://www1.ietf.org/mail-archive/web/ietf-announce/current/msg00755.html

RFC 2047 gives rationale for the encoded-word limit, and the Message Format limit can be found in RFCs 2821 and 2822. Given the large deployed base of software implementing those core Internet protocols, I do not foresee an opportunity to increase the encoded-word length limit at this time. Consequently, the maximum total for registered charset and language tags remains at no more than 63 octets (and it is conceivable that future encodings might require a longer text portion).

I suggest that charset names and aliases be limited to the current maximum of 45 octets, and that language-tags for use in encoded-words and extended parameters be limited to 16 octets (an increase of 45% over the longest registered language tag). That leaves but 2 octets of expansion room for encoding tags and/or encoding-driven restrictions on the encoded text.

Ideally, a lower limit for MIME charset names would be used; aside from a couple of pathological cases, most registered MIME-compatible charset names are 17 octets or less in length, and many have shorter aliases. However, establishing a limit lower than the longest currently-registered name would require extraordinary action. It might be possible to assign MIME-preferred-name aliases to the excessively long registered charset names, for example. However, the overall maximum (regardless of whether the charset is compatible with MIME text/plain) should probably be held at 45 octets. As for the MIB-specific aliases, I'll leave specific recommendations up to others, but 45 octets can certainly accommodate the current MIB-specific limit of 40 octets.
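The suggested limits are easy to check mechanically. The sketch below assumes the 63-octet budget, the 45- and 16-octet limits and the 11-octet longest registered tag quoted in this message; the function names are hypothetical and nothing here comes from an existing library.

```python
# Sketch: checking a charset name and language tag against the limits
# suggested above. All names are illustrative assumptions.
BUDGET = 63             # octets available for charset + language tag
MAX_CHARSET = 45        # suggested limit for charset names and aliases
MAX_LANGUAGE_TAG = 16   # suggested limit for language tags
LONGEST_REGISTERED_TAG = 11

def fits_suggested_limits(charset, language_tag):
    """True if both tokens respect the suggested per-token limits."""
    # Both tokens are ASCII, so one character is one octet.
    charset_len = len(charset.encode("ascii"))
    tag_len = len(language_tag.encode("ascii"))
    return charset_len <= MAX_CHARSET and tag_len <= MAX_LANGUAGE_TAG

def spare_octets():
    """Expansion room left when both tokens are at their maximum."""
    return BUDGET - MAX_CHARSET - MAX_LANGUAGE_TAG

def tag_limit_increase_percent():
    """Increase of the suggested tag limit over the longest registered tag."""
    extra = MAX_LANGUAGE_TAG - LONGEST_REGISTERED_TAG
    return round(100 * extra / LONGEST_REGISTERED_TAG)

print(fits_suggested_limits("ISO-8859-1", "en-US"))
print(spare_octets(), tag_limit_increase_percent())
```

Because each token is checked against its own limit, any pair that passes is guaranteed to fit within the combined budget.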