On Thu, 2002-10-10 at 20:46, Havoc Pennington wrote:
>
> Robert Claeson <r.claeson@computer.org> writes:
> > I noticed that Unicode UTF-8 is now the default encoding when most
> > Western European locales are selected. Since some ISO 8859 character
> > set is usually the norm for those locales, I would be interested in the
> > rationale behind Psyche using UTF-8 rather than ISO 8859.
> >
>
> The reasons for Unicode include:
>
>  - so you can use multiple languages at once in a document
>
>  - so that programs can write a single generic algorithm
>    for, say, word breaking, instead of special-casing each
>    locale
>
>  - because most of the modern apps (all Qt, GTK apps, most scripting
>    languages, etc.) are using Unicode internally, so using it
>    externally speeds things up
>
>  - so that Chinese/Japanese/Korean are going through the same
>    codepaths as European languages, so that there are fewer
>    CJK-specific issues. (Of course we don't default to UTF-8 for CJK
>    yet, but it's coming.)
>
>  - because the filesystem needs to be in UTF-8 unless all users
>    of a system are using the same language exclusively
>
> FWIW, the issues people are seeing with UTF-8 are almost all things
> that Asian users have been living with for years... now everyone's in
> the same boat, let's patch the leaks. ;-)

Makes sense. I remember the days when "we" (us Europeans) had all
kinds of issues using 8-bit characters in BSD 4.3 and SVR3, both of
which had trouble with the high bit. I guess my age is showing. :-)

/Robert