On Mon, Oct 08, 2001 at 03:39:53AM +0200, Christian Rose <menthos@xxxxxxxxxxx> wrote:
> Native support for UTF-8 is uncommon and of course that is bad and

Sorry? My mailer supports it (mutt), my editor supports it (vim), my
terminal supports it (xterm), my irc client supports it (epic), my
browser(s) support it (lynx, netscape, mozilla), my OS supports it on the
console (linux)... UTF-8 support is more common than support for most other
charsets, actually.

> Editors aside, simply looking at and otherwise using console text tools
> on UTF-8 files with non-ASCII content, usually has little if any
> support.

The same is true for anything except ASCII. Hint: you cannot represent the
majority of languages with ASCII.

(And I was told emacs can do UTF-8. At least people found a way to decode
my mails properly in emacs.) Maybe it's just that emacs can't natively edit
UTF-8 text, but it should be possible to just convert it to some native
charset. Every unix comes with iconv, and most do support UTF-8, for
example.

> I'm sure you'll find out many other surprises when you check what text
> tools in any major GNU/Linux distribution actually fully supports UTF-8,

I'd say the majority does.

> Sure the tools need to get updated in the end, but it's a very slow
> process that has already taken years with little happening and surely
> many years remain to come

I really think you need a reality check.

> have to use UTF-8 is a big practical problem for translators. Note that

s/big/little/

Every editor should be able to pipe through some external program
(iconv -f utf-8 -t koi8-r for Russian, for example) on loading/saving. And
I am quite sure translators for non-ASCII languages already know how to
cope with charset problems - they needed to.

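
To make the load/save conversion concrete, here is a minimal Python sketch
of what such a filter does. It is only an illustration: the function names
are hypothetical, KOI8-R is just the example charset mentioned above, and a
real editor hook would more likely call iconv itself.

```python
# Sketch of an editor load/save filter: the in-process equivalent of
# `iconv -f utf-8 -t koi8-r` on load and the reverse conversion on save.
# Function names are hypothetical, chosen for this example only.

def utf8_to_local(data: bytes, local: str = "koi8-r") -> bytes:
    """On load: re-encode UTF-8 bytes in the editor's native charset."""
    return data.decode("utf-8").encode(local)


def local_to_utf8(data: bytes, local: str = "koi8-r") -> bytes:
    """On save: convert the editor's native charset back to UTF-8."""
    return data.decode(local).encode("utf-8")


original = "Привет, мир".encode("utf-8")
in_editor = utf8_to_local(original)

# KOI8-R stores each Cyrillic letter in one byte, UTF-8 needs two.
assert len(in_editor) < len(original)
# The round trip is lossless as long as every character exists in KOI8-R.
assert local_to_utf8(in_editor) == original
```

Text containing characters outside the local charset would raise a
UnicodeEncodeError here, which is the usual limit of this approach.
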
> That still won't solve the problems:

(agreed to all of them - I was purely concerned about UTF-8 ;)

> > While I do agree with Marc that XML is not the all-purpose solution I
> > really think it's going to help in the case of localisation by the
> > consistent use of UTF-8 and other concepts like includeable files and
> > overrideable tags.

XML and UTF-8 are two completely orthogonal concepts - XML is represented
in Unicode and can be written in almost any encoding (ASCII, VISCII,
whatever). I don't see any problem having multiple different(!) files with
different encodings, pleasing whatever a local translator likes.

-- 
      -----==-                                             |
      ----==-- _                                           |
      ---==---(_)__  __ ____  __       Marc Lehmann      +--
      --==---/ / _ \/ // /\ \/ /       pcg@xxxxxxxx      |e|
      -=====/_/_//_/\_,_/ /_/\_\       XX11-RIPE         --+
    The choice of a GNU generation                       |
                                                         |