Hi -

Could someone explain the connection Bruce sees between the limit on the
length of descriptors used in writing MIB modules and the tags used for
identifying character sets? I thought I understood MIB compiler issues
fairly well, but I seem to be missing something here, as I just can't see
how the MIB compiler constraints are relevant.

Randy

> From: "Bruce Lilly" <blilly@xxxxxxxxx>
> To: <ietf-charsets@xxxxxxxx>; <ietf-822@xxxxxxx>
> Cc: <ietf@xxxxxxxx>; <ietf-languages@xxxxxxxxxxxxx>
> Sent: Sunday, December 12, 2004 2:10 PM
> Subject: Charset name length(s)
>
> On Sun December 5 2004 13:36, McDonald Ira wrote:
> > Hi,
> >
> > Relative to Bruce's suggestion that the 40 character restriction
> > in names applies only to MIBs:
> >
> > (1) MIBs in both SMIv1 and SMIv2 have always supported the ASN.1
> >     standard maximum of 63 characters for identifiers
> >
> > (2) But, due to underlying linker restrictions, _many_ MIB compilers
> >     truncate identifiers at 31 characters (or arbitrarily rewrite
> >     them after about 25 characters)
> >
> > (3) So 40 characters isn't a helpful restriction for MIB names.
>
> [I'm copying the ietf-822 list as the issue(s) discussed
> affect MIME and the Internet Message Format; responses
> to the charset-specific part should remain on ietf-charsets.
> I'm also copying the ietf and ietf-languages lists where
> a related discussion about language tags is taking place.]
>
> To date, I have merely pointed out that the registration
> for MIME names imposes no upper bound, but that the MIB
> requirements do indicate a limit for the cs* aliases. I
> have not stated whether I thought that there should be an
> explicit limit in general. It is now time to speak up on
> that matter.
>
> I am prompted to do so by considerations arising from a
> proposal to replace RFC 3066, which defines language tags
> and their registration procedure.
> Charset names and
> language tags are connected by way of RFC 2231, which
> amended RFC 2047's definition of "encoded-word" to include
> provision for a language tag. An encoded-word has the
> form (my representation, not the official one; for the
> latter consult RFC 2231 and errata):
>
>   =?<charset>*<language-tag>?<encoding>?<text>?=
>
> The text part must be at least 4 octets in order to accommodate
> B encoding restrictions. Encodings are currently represented
> by a single octet, and as encodings are intended to be limited
> in number, let's assume that that will suffice indefinitely.
> That leaves a maximum of 63 octets for the total length of the
> charset name and the language-tag. RFC 2978 (charset name
> registration) provides a procedure for review, so while the
> charset name could theoretically be infinite in length, the
> review process is expected to catch cases which would prove
> problematic for encoded-words. In fact, so far as I can
> determine, the longest charset name suitable for use in an
> encoded-word (i.e. charsets suitable for text/plain, considering
> the preferred MIME name where specified, otherwise the primary
> name) has a length of 45 octets.
>
> RFC 2231 also provides for charset specification in extended
> parameters used with Content-Type and Content-Disposition
> fields; these are not required to be charsets suitable for
> text/plain, and the permitted combined length of charset and
> language tag is much greater than that in an encoded-word
> (but still finite).
>
> Under RFC 3066, there is a similar registration and review
> procedure, and while again there is the theoretical
> possibility of a very long language tag, the longest such
> registered tag has a length of 11 octets.
>
> Combined, the longest charset and longest language tag
> total 56 octets, which is less than the 63 octet limit
> imposed by encoded-word syntax.
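The 63-octet figure can be reproduced with a little arithmetic. The sketch below is only an illustration: it assumes the overall 75-character encoded-word limit from RFC 2047 (the message refers to that limit without stating the number), and the constant names are hypothetical. The 45- and 11-octet values are the ones quoted in the message.

```python
# Octet budget for an RFC 2231 style encoded-word:
#   =?<charset>*<language-tag>?<encoding>?<text>?=
# Assumption: RFC 2047 caps an encoded-word at 75 characters overall.
MAX_ENCODED_WORD = 75

# Fixed delimiters: "=?", "*", "?", "?", "?="
DELIMITERS = len("=?") + len("*") + len("?") + len("?") + len("?=")
ENCODING_LEN = 1   # a single octet, e.g. "B" or "Q"
MIN_TEXT_LEN = 4   # smallest B-encoded text, per the message

# What remains for charset name plus language tag.
BUDGET = MAX_ENCODED_WORD - DELIMITERS - ENCODING_LEN - MIN_TEXT_LEN

LONGEST_CHARSET = 45       # longest suitable charset name, per the message
LONGEST_LANGUAGE_TAG = 11  # longest registered language tag, per the message

used = LONGEST_CHARSET + LONGEST_LANGUAGE_TAG
print(BUDGET, used, BUDGET - used)  # prints: 63 56 7
```

With those inputs the longest registered pair leaves 7 octets of slack inside the budget.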
>
> Unregistered, private-use charset and/or language-tags
> could of course be longer; that does not concern me.
> Private-use requires coordination between communicating
> parties, and it is a matter for those parties to agree
> on private-use tags that fit within the relevant limits.
>
> There is a draft proposal for a replacement of RFC 3066
> which would decouple non-private-use language tag use
> from the review/registration procedure and which would
> provide for infinite length non-private-use language
> tags. That represents a problem not only for
> encoded-word use, but also for Internet Message
> Format header (message- and MIME-part) fields which use
> language tags, such as RFC 3282's Content-Language and
> Accept-Language. A "New Last Call" has been issued
> for the draft proposal on the ietf-announce list:
> http://www1.ietf.org/mail-archive/web/ietf-announce/current/msg00755.html
>
> RFC 2047 gives rationale for the encoded-word limit,
> and the Message Format limit can be found in RFCs 2821
> and 2822. Given the large deployed base of software
> implementing those core Internet protocols, I do not
> foresee an opportunity to increase the encoded-word
> length limit at this time. Consequently, the maximum
> total for registered charset and language tags remains
> at no more than 63 octets (and it is conceivable that
> future encodings might require a longer text portion).
> I suggest that charset names and aliases be limited to
> the current maximum of 45 octets, and that language-tags
> for use in encoded-words and extended parameters be
> limited to 16 octets (an increase of 45% over the
> longest registered language tag). That leaves but 2
> octets of expansion room for encoding tags and/or
> encoding-driven restrictions on the encoded text.
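As a rough check of the proposed caps, the sketch below builds a worst-case encoded-word and measures it. The helper names `encoded_word` and `fits` are hypothetical, and the 75-character overall limit is again an assumption taken from RFC 2047 rather than from the message; the 45- and 16-octet caps are the ones suggested above.

```python
MAX_ENCODED_WORD = 75   # assumption: overall RFC 2047 limit
CHARSET_CAP = 45        # cap proposed in the message
LANGUAGE_TAG_CAP = 16   # cap proposed in the message


def encoded_word(charset, language, encoding, text):
    """Assemble the informal form used in the message."""
    return f"=?{charset}*{language}?{encoding}?{text}?="


def fits(charset, language, encoding="B", text="AAAA"):
    """True if both tags respect the caps and the whole word fits."""
    if len(charset) > CHARSET_CAP or len(language) > LANGUAGE_TAG_CAP:
        return False
    word = encoded_word(charset, language, encoding, text)
    return len(word) <= MAX_ENCODED_WORD


# Worst case: both tags at their caps, minimal 4-octet text.
worst = encoded_word("c" * CHARSET_CAP, "l" * LANGUAGE_TAG_CAP, "B", "AAAA")
headroom = MAX_ENCODED_WORD - len(worst)
print(len(worst), headroom)  # prints: 73 2
```

The 2 octets of headroom match the expansion room mentioned in the message; a tag one octet over either cap, such as `fits("UTF-8", "l" * 17)`, is rejected.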
>
> Ideally, a lower limit for MIME charset names would
> be used; aside from a couple of pathological cases, most
> MIME-compatible charset names registered are 17 octets
> or less in length; many have shorter aliases. However,
> establishing a limit lower than the longest
> currently-registered name would require extraordinary action. It
> might be possible to assign MIME-preferred-name aliases
> to the excessively-long registered charset names, for
> example. However, the overall maximum (regardless of
> whether the charset is compatible with MIME text/plain)
> should probably be held at 45 octets. As for the
> MIB-specific aliases, I'll leave specific recommendations up
> to others, but 45 octets is certainly capable of
> accommodating the current MIB-specific limit of 40 octets.
> _______________________________________________
> Ietf-languages mailing list
> Ietf-languages@xxxxxxxxxxxxx
> http://www.alvestrand.no/mailman/listinfo/ietf-languages

_______________________________________________
Ietf@xxxxxxxx
https://www1.ietf.org/mailman/listinfo/ietf