On Thursday 10 September 2009 07:54:25 Patrick Nagel wrote:
> The real problem with charsets and encodings is that you always have to
> tell the interpreting program (browser, mail/news reader, ... whichever
> program wants to show the bits from the net in a readable form) which
> charset (and encoding) has actually been used to encode the message, so
> that it can choose the matching decoder.
>
> If this information is not given, there is no other way than guessing. And
> everybody knows that computers are not good at that. How would a computer
> know what the string 'äëïöüñ' from James should actually look like, if he
> hadn't specified the encoding in the header (open the source code of
> his mail, and you will see the following line: Content-Type: text/plain;
> charset="iso-8859-1")? The computer could then (for example) have guessed
> that those bits were supposed to mean "潆秭" ("eddy billion" in Chinese)...
> OK, I admit, I cheated a bit on this one - it wouldn't have been a valid
> bit sequence for a GBK decoder, which any sane guessing algorithm would
> have detected... but still, I think you get the point.
>
> So, people, use Unicode (the "universal charset") encoded as UTF-8 for
> everything - and maybe in a few years we can all forget about all this
> charset/encoding mess :)
>

That explains a lot, thanks.

> P.S.: I used Unicode/UTF-8 in this mail (and of course it's specified in
> the mail's header), otherwise it wouldn't even have been possible to put
> both Chinese characters and umlauts in one mail.
>

At least that looks hopeful for the future ;-)

Anne
-- 
New to KDE4? - get help from http://userbase.kde.org
Just found a cool new feature? Add it to UserBase
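
For anyone who wants to try the point from the quoted message themselves, here is a minimal Python sketch. It is only an illustration: the helper name `try_decode` is made up for this example, and what the GBK decoder returns for these particular bytes depends on the codec, so no output is claimed for it here. The idea is simply that the same bytes are read differently (or rejected) depending on which charset the reader assumes.

```python
def try_decode(data: bytes, charset: str):
    """Hypothetical helper: return the decoded text, or None if the
    bytes are not a valid sequence in the given charset."""
    try:
        return data.decode(charset)
    except UnicodeDecodeError:
        return None


original = "äëïöüñ"
# What the sender puts on the wire when the header says iso-8859-1:
# one byte per character.
raw = original.encode("iso-8859-1")

# A receiver that is not told the charset has to guess among candidates.
for charset in ("iso-8859-1", "utf-8", "gbk"):
    print(charset, "->", try_decode(raw, charset))

# UTF-8 can carry Chinese characters and umlauts in the same message ...
mixed = "潆秭 äëïöüñ"
assert mixed.encode("utf-8").decode("utf-8") == mixed

# ... whereas iso-8859-1 has no way to represent the Chinese characters.
try:
    mixed.encode("iso-8859-1")
except UnicodeEncodeError:
    print("iso-8859-1 cannot encode the Chinese characters")
```

Only the iso-8859-1 guess gives back the original string. The UTF-8 decoder rejects the bytes outright, which is the kind of check a guessing algorithm can use to rule out candidates.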
___________________________________________________ This message is from the kde mailing list. Account management: https://mail.kde.org/mailman/listinfo/kde. Archives: http://lists.kde.org/. More info: http://www.kde.org/faq.html.