Re: Rationale behind UTF-8?

On Thu, 2002-10-10 at 20:46, Havoc Pennington wrote:
> 
> Robert Claeson <r.claeson@computer.org> writes:
> > I noticed that Unicode UTF-8 is now the default encoding when most
> > Western European locales are selected. Since some ISO 8859 character
> > set is usually the norm for those locales, I would be interested in the
> > rationale behind Psyche using UTF-8 rather than ISO 8859.
> > 
> 
> The reasons for Unicode include:
> 
>  - so you can use multiple languages at once in a document
> 
>  - so that programs can write a single generic algorithm 
>    for, say, word breaking, instead of special-casing each
>    locale
> 
>  - because most of the modern apps (all Qt, GTK apps, most scripting
>    languages, etc.) are using Unicode internally, so using it
>    externally speeds things up
> 
>  - so that Chinese/Japanese/Korean are going through the same
>    codepaths as European languages, so that there are fewer
>    CJK-specific issues. (Of course we don't default to UTF-8 for CJK
>    yet, but it's coming.)
> 
>  - because the filesystem needs to be in UTF-8 unless all users
>    of a system are using the same language exclusively
> 
> FWIW, the issues people are seeing with UTF-8 are almost all things
> that Asian users have been living with for years... now everyone's in
> the same boat, let's patch the leaks. ;-)
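
The first point in the list (several languages in one document) can be
illustrated with a minimal sketch. The sample strings and the can_encode
helper are made up for illustration and do not come from the thread; the
sketch only assumes Python's standard "utf-8" and "iso-8859-1" codecs.

```python
# Hypothetical sample text mixing Swedish, Greek, and Japanese.
text = "smörgåsbord, λόγος, 日本語"


def can_encode(s: str, encoding: str) -> bool:
    """Return True if every character of s is representable in encoding."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# UTF-8 can represent the whole mixed-language string.
utf8_ok = can_encode(text, "utf-8")

# ISO 8859-1 covers Western European letters only, so the Greek and
# Japanese characters make the mixed string unencodable.
latin1_ok = can_encode(text, "iso-8859-1")

# The Swedish word on its own fits in ISO 8859-1.
swedish_only_latin1_ok = can_encode("smörgåsbord", "iso-8859-1")
```

The same limitation is behind the filesystem point: a file name written
by a user of one 8-bit locale may not be representable in another user's
8-bit locale, whereas UTF-8 covers both.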

Makes sense. I remember the days when "we" (us Europeans) had all
kinds of issues using 8-bit characters in BSD 4.3 and SVR3, which
had trouble with the high bit. I guess my age is showing.
:-)

/Robert
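
One common form of the high-bit trouble mentioned above is software that
clears the eighth bit of every byte. The message does not say which
failure modes BSD 4.3 or SVR3 actually had, so the sketch below is only
an assumed illustration of that one form; strip_high_bit and the sample
word are hypothetical.

```python
def strip_high_bit(data: bytes) -> bytes:
    """Clear bit 7 of every byte, as a 7-bit-only code path would."""
    return bytes(b & 0x7F for b in data)


original = "Malmö"

# In ISO 8859-1 the letter "ö" is the single byte 0xF6.
encoded = original.encode("iso-8859-1")

# Clearing the high bit turns 0xF6 into 0x76, which is ASCII "v".
mangled = strip_high_bit(encoded).decode("ascii")

# Pure 7-bit ASCII text passes through unchanged, which is why such
# problems went unnoticed by users who wrote only unaccented English.
unchanged = strip_high_bit(b"plain")
```

Here mangled ends up as "Malmv": the accented letter is silently
replaced by an unrelated ASCII character instead of being rejected.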
