Robert Claeson <r.claeson@computer.org> writes:
> I noticed that Unicode UTF-8 is now the default encoding when most
> Western Europeans locales are selected. Since some ISO 8859 character
> set is usually the norm for those locales, I would be interested in the
> rationale behind Psyche using UTF-8 rather than ISO 8859.
>

The reasons for Unicode include:

 - so you can use multiple languages at once in a document
 - so that programs can write a single generic algorithm for say word
   breaking, instead of special-casing each locale
 - because most of the modern apps (all Qt, GTK apps, most scripting
   languages, etc.) are using Unicode internally, so using it
   externally speeds things up
 - so that Chinese/Japanese/Korean are going through the same
   codepaths as European languages, so that there are fewer
   CJK-specific issues. (Of course we don't default to UTF-8 for CJK
   yet, but it's coming.)
 - because the filesystem needs to be in UTF-8 unless all users of a
   system are using the same language exclusively

FWIW, the issues people are seeing with UTF-8 are almost all things
that Asian users have been living with for years... now everyone's in
the same boat, let's patch the leaks. ;-)

Havoc