* John C. Klensin:

> --On Wednesday, August 26, 2020 20:33 +0200 Florian Weimer
> <fw@xxxxxxxxxxxxx> wrote:
>
>>...
>>> If the charset name code you are interested in registering is
>>> defined in an RFC, you are presumably want the first, but it
>>> would be helpful for you to confirm that.
>>
>> The charset is currently unnamed, as far as I can see: it's
>> the UTF-7 variant in section 5.1.3 of RFC 3501.
>
> I have not been following IMAP work carefully in recent years
> (others here may be able to easily fill in the blanks) but my
> impression is that, as Unicode encoded in UTF-8 has taken over
> as the generally preferred form for transmission of non-ASCII
> characters over the Internet, UTF-7 has generally fallen out of
> use even if it has not been explicitly deprecated.  In
> particular, its use is strongly discouraged in fully
> internationalized email contexts ... see the discussion in RFC
> 6855 for IMAP and RFC 6530ff for the more general case.

I see.  Is RFC 6855 widely implemented?  The IMAP server at work does
not seem to offer the UTF8=ACCEPT capability, if I read the debug
output from the mail client correctly.  <https://imapwiki.org/Specs>
does not list any compatible implementations, if I read that table
correctly.

>> The charset is already defined and widely implemented.  It's a
>> transformation format of Unicode.
>
> But a transformation form that was invented in the IETF, more or
> less specifically for one particular element of IMAP, not one
> standardized or recognized by the Unicode Consortium.  That
> might be hair-splitting except for the problem below.
>
>> It's just presently unnamed.
>> Having a standardized name for it would help implementing
>> support in the POSIX iconv framework.
>
> Ah.  Up to you -- and something to be worked out with the
> designated experts and on the appropriate lists (once it is
> sorted out) -- but some free (and maybe worth what you pay for
> it) advice: you'd get much better support and interoperability
> by dropping UTF-7 and converting the relevant applications to
> use UTF-8.  That is particularly important because there are
> apparently several web applications and form processors that
> will ignore the charset parameter (either entirely or with any
> values they don't recognize) and try to deduce the charset or
> encoding form heuristically.  And they will not accurately
> detect UTF-7 for reasons that should be obvious.

It seems to me that we need this UTF-7 variant for a few years to
come.  Implementations already have come up with ad-hoc names such as
"utf-7-imap" or "mUTF-7".  Telling implementors to use UTF-8 instead,
when it does not provide interoperability with existing software, does
not seem particularly useful to me.
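
For reference, here is a rough Python sketch of the transformation in
question, as I read section 5.1.3 of RFC 3501: printable US-ASCII
represents itself, "&" is written as "&-", and everything else is
UTF-16BE in base64 with "," in place of "/" and no "=" padding,
bracketed by "&" and "-".  The function names encode_mutf7 and
decode_mutf7 are made up for illustration and do not correspond to any
existing library or iconv charset name, and the decoder does not
validate malformed input.

```python
import base64


def encode_mutf7(text: str) -> str:
    """Encode a mailbox name using the RFC 3501 section 5.1.3 rules (sketch)."""
    out = []
    pending = []  # non-ASCII characters waiting to be base64-encoded

    def flush() -> None:
        # Emit the pending run as "&" + modified base64 of UTF-16BE + "-".
        if pending:
            raw = "".join(pending).encode("utf-16-be")
            b64 = base64.b64encode(raw).decode("ascii")
            out.append("&" + b64.rstrip("=").replace("/", ",") + "-")
            pending.clear()

    for ch in text:
        if ch == "&":
            flush()
            out.append("&-")  # literal ampersand
        elif 0x20 <= ord(ch) <= 0x7E:
            flush()
            out.append(ch)  # printable US-ASCII represents itself
        else:
            pending.append(ch)
    flush()
    return "".join(out)


def decode_mutf7(data: str) -> str:
    """Decode the same format; raises ValueError on an unterminated shift."""
    out = []
    i = 0
    while i < len(data):
        if data[i] != "&":
            out.append(data[i])
            i += 1
            continue
        end = data.index("-", i)  # the shift sequence ends at the next "-"
        chunk = data[i + 1:end]
        if chunk == "":
            out.append("&")  # "&-" is a literal ampersand
        else:
            b64 = chunk.replace(",", "/")
            b64 += "=" * (-len(b64) % 4)  # restore the stripped padding
            out.append(base64.b64decode(b64).decode("utf-16-be"))
        i = end + 1
    return "".join(out)


if __name__ == "__main__":
    # Mailbox name taken from the example in RFC 3501 section 5.1.3.
    name = "~peter/mail/台北/日本語"
    print(encode_mutf7(name))  # ~peter/mail/&U,BTFw-/&ZeVnLIqe-
    print(decode_mutf7(encode_mutf7(name)) == name)  # True
```

A converter registered under whatever name ends up being chosen would
have to implement essentially these rules, plus proper rejection of
invalid input, which the sketch leaves out.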