Hi Christian,

> a UTF-8 character may (sometimes) be even 16bit of size; think about german
> umlauts, they get escaped somehow, and then follows the code for them.

The Utf8Char class does not represent a full Unicode codepoint; it represents a single 8-bit chunk. A Unicode codepoint is encoded as 1 to 4 such 8-bit UTF-8 chunks, so a German umlaut such as ü (U+00FC) occupies two Utf8Char chunks (0xC3 0xBC).

A Unicode codepoint is 21 bits. NOTE: an ISO/IEC 10646 codepoint is 31 bits, but I'm not concerned about ISO/IEC 10646.

> This really can't work either. Did you get this compile?

Sorry, copy-n-paste error. I did get something similar to compile. The example was only meant to be illustrative.

> Sorry for not being much help more, but you could/should google for "encoding"
> and have a look around for some other implementations / how they're doing it
> (the en[/de]coding-way).

I've Googled. I've spoken with various members of the C++ steering committee. I've asked on several forums.

I did get some useful pointers from Martin York of Symantec. I was directed to Standard C++ IOStreams and Locales by Langer and Kreft, which I haven't picked up yet but will shortly.

Thanks,
--Eljay
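
P.S. To make the "1 to 4 8-bit chunks" point concrete, here is a minimal sketch of the encoding step. It is not the Utf8Char code discussed in this thread; the function name encode_utf8 and the use of std::string as the chunk container are assumptions made purely for illustration, and it only covers the 21-bit Unicode range (not the 31-bit ISO/IEC 10646 range).

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (illustration only): encode one Unicode codepoint
// as a sequence of 1 to 4 8-bit UTF-8 chunks.
std::string encode_utf8(std::uint32_t cp)
{
    std::string out;
    if (cp <= 0x7F) {
        // 1 chunk: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp <= 0x7FF) {
        // 2 chunks: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0xFFFF) {
        // Surrogate values are not valid codepoints to encode.
        if (cp >= 0xD800 && cp <= 0xDFFF)
            throw std::invalid_argument("surrogate value");
        // 3 chunks: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {
        // 4 chunks: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        throw std::out_of_range("beyond the 21-bit Unicode range");
    }
    return out;
}

// Small self-check: the chunk count grows with the codepoint value.
void check_chunk_counts()
{
    assert(encode_utf8(0x41).size() == 1);    // 'A'
    assert(encode_utf8(0xFC).size() == 2);    // u-umlaut
    assert(encode_utf8(0x20AC).size() == 3);  // euro sign
    assert(encode_utf8(0x1F600).size() == 4); // outside the BMP
}
```

Each char appended here corresponds to one 8-bit chunk in the sense used above, which is why a single chunk on its own cannot stand for a whole character.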