--On 23 December 2005 11:36 +0100 "Tom.Petch" <sisyphus@xxxxxxxxxxxxxx> wrote:
A) Character set. UTF-8 implicitly specifies the use of Unicode/ISO 10646, which contains 97,000 - and rising - characters. Some (proposed) standards limit themselves to 0000..007F, which is not at all international, others to 0000..00FF, essentially Latin-1, which suits many Western languages but is not truly international. Is 97,000 really appropriate or should there be a defined subset?
I think Ned has answered most of your other points... I'll chime in on this one.....
My opinion: ALL attempts at defining a "useful" character set of any size between 128 and "all you can eat" for use internationally have been dismal failures. They have been used in some niche; sooner or later there's a need to work outside that box, and gateways or other forms of self-torture result. (Alvestrand's equality: gateways = pain).
At the moment, the only reasonable candidate for an "all you can eat" character set is the Unicode charset. All other alternatives, including the bizarrely byzantine character set switching schemes of ISO 2022, are basically dead in the marketplace.
So there are only two real choices for charset left: ASCII and Unicode. ASCII is unsuitable for any language except the technologists' simplified version of English. So if you want text, and want it to work internationally, there's only one choice left.
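To make the subset problem concrete, here is a minimal sketch in Python. It checks which of a few sample words survive encoding into ASCII (0000..007F), Latin-1 (0000..00FF) and UTF-8. The sample words and the helper names `encodable` and `coverage` are assumptions chosen for illustration; they are not taken from any specification.

```python
# Hypothetical sample words; any text outside a given subset behaves the same way.
SAMPLES = ["plain", "Straße", "Ærlig", "Ελληνικά", "日本語"]

# Python codec names for the three repertoires discussed in the thread.
CODECS = ["ascii", "latin-1", "utf-8"]


def encodable(text, codec):
    """Return True if every character of text can be represented in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def coverage(samples=SAMPLES, codecs=CODECS):
    """Map each codec name to the sample words it can represent."""
    return {c: [s for s in samples if encodable(s, c)] for c in codecs}


if __name__ == "__main__":
    for codec, words in coverage().items():
        print(f"{codec:8} handles {len(words)} of {len(SAMPLES)} samples: {words}")
```

Under these assumptions, each smaller repertoire drops some of the samples, and only the Unicode-based encoding accepts all of them. Any text that falls outside the chosen subset has to be rejected or converted somewhere, which is where the gateways come in.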
Subsets are a mistake.
Harald
_______________________________________________ Ietf@xxxxxxxx https://www1.ietf.org/mailman/listinfo/ietf