Hi everyone,

GCC currently supports two options for dealing with the content of strings hardcoded in source files, -fexec-charset and -finput-charset (and -fwide-exec-charset, but let's set that one aside for now).

As I understand it, when GCC reads a source file from disk, it assumes the file to be in the "input charset" specified with -finput-charset, or failing that, in the locale's charset, or failing that, in UTF-8. The content is then transcoded to whatever internal charset GCC uses. This includes the contents of any ordinary string constants. When GCC then creates an executable, the strings are transcoded from GCC's internal charset into the charset specified with -fexec-charset, or failing that, into UTF-8. So even if my source file is written in, say, UTF-32, any ordinary string literals end up as UTF-8 in the executable.

If the above is incorrect, please correct me. In that case I would appreciate a pointer to where I can read up on the correct process.

Now comes the question. The above is true for ordinary string literals. The string literal in the following source code:

    int main()
    {
        const char* some_string = "Bärenstark";
        return 0;
    }

will thus always end up being transcoded to UTF-8 and stored as UTF-8 in the executable if -fexec-charset=UTF-8 is set and the input charset is set or detected correctly. If, on the other hand, I specify -fexec-charset=ISO-8859-1, it should be stored in the executable in *that* charset.

What effect does -fexec-charset have if the source code uses the new C++11 charset-aware literals? For example, if the source code looks like this:

    int main()
    {
        const char* some_string = u8"Bärenstark";
        return 0;
    }

u8 denotes a string encoded in UTF-8, so I would expect this string literal to *always* end up in UTF-8 in the final executable, i.e. the value of the option -fexec-charset should be ignored, especially if it is unset.
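
To make that expectation concrete, here is a minimal sketch of how it could be checked. It is only an illustration of what I expect, not a statement of what GCC does. The macro BAEREN_U8 and the names kU8Size, u8_byte and check_u8_literal are made up for this example. The literal uses the escape \u00e4 instead of a raw "ä" so that the result does not depend on -finput-charset.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical name for this sketch. \u00e4 is U+00E4 (ä), written as an
// escape so the source file's own encoding does not matter.
#define BAEREN_U8 u8"B\u00e4renstark"

// Size in bytes including the terminating NUL. If the literal is UTF-8,
// "Bärenstark" is 10 characters, ä takes two bytes, plus NUL: 12.
constexpr std::size_t kU8Size = sizeof(BAEREN_U8);

// Returns byte i of the u8 literal as an unsigned value. The cast works
// whether the element type is char (C++11..17) or char8_t (C++20).
inline unsigned u8_byte(std::size_t i) {
    return static_cast<unsigned char>(BAEREN_U8[i]);
}

// Asserts the expectation: ä is stored as the UTF-8 sequence C3 A4.
inline void check_u8_literal() {
    assert(kU8Size == 12);
    assert(u8_byte(0) == 0x42);  // 'B'
    assert(u8_byte(1) == 0xC3);  // first byte of ä in UTF-8
    assert(u8_byte(2) == 0xA4);  // second byte of ä in UTF-8
}
```

Calling check_u8_literal() from main() and compiling the file once without -fexec-charset and once with -fexec-charset=ISO-8859-1 would show whether the option has any effect on the u8 literal. If my expectation is right, the assertions pass in both builds.
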
However, even if I set -fexec-charset=ISO-8859-1, I would still expect the string to be in UTF-8 in the final executable, since the source code explicitly requests UTF-8 (and GCC should probably emit a warning that the two don't fit together well). Moreover, this assumption should hold for all conformant C++ compilers, shouldn't it?

Is this correct? Could an explanation of this be added to the documentation of the -fexec-charset command-line switch?

Greetings
Marvin

-- 
Blog: http://www.guelkerdev.de
PGP/GPG ID: F1D8799FBCC8BC4F