> Hi Tao Wang,
>
> My test.cpp source is UTF-8 with BOM.
>
> If I compile it like this...
>
> g++ -x c++ <(xxd -g 1 -s 3 test.cpp | xxd -g 1 -s -3 -r) -o a.out
>
> ... that strips out the first three bytes of the file. For test.cpp, these happen to be the BOM (ef bb bf).
>
> You may want to create a little 'stripBOM' program that behaves like 'cat', but gobbles the BOM if present.
>
> Or you could use awk, sed, perl, or your favorite text-munging tool of choice to perform the same conversion. I just used xxd because it was quick, for illustrative purposes. (There's probably a more suitable Unix tool than xxd for this kind of cat-with-offset, but you'd want something that filters out the BOM rather than always offsetting.)
>
> HTH,
> --Eljay
>

Hi Eljay and Tao Wang,

I have experienced the same problem working in a multi-platform environment with a shared repository. In my case the source files have no BOM (they are stored on the server in the Windows machines' native encoding), so my solution was to add -finput-charset=WINDOWS-1252 to gcc's command line.

Unfortunately, it seems that iconv has no way to insert or remove the BOM, so Tao Wang is out of luck. Eljay's solution isn't always viable either, because if the source file #includes a header with a BOM, the compilation still fails.

I think there are two possible ways out:

1) Automatically run a conversion command (like uconv --remove-signature) on checkout/commit
2) Install a modified libiconv with an additional character set "UTF8-BOM"

Best regards
Dario
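
P.S. For reference, the 'stripBOM' filter Eljay suggests in the quoted message could look roughly like the sketch below. This is only an illustration, not something taken from either message: the function name strip_bom and its stream-based signature are my own assumptions, and it only handles the UTF-8 BOM (ef bb bf).

```cpp
#include <cassert>
#include <cstring>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper (name and signature are illustrative only).
// Copies `in` to `out` unchanged, except that a leading UTF-8 BOM
// (bytes EF BB BF) is dropped if, and only if, it is present.
void strip_bom(std::istream& in, std::ostream& out) {
    static const char bom[3] = {'\xEF', '\xBB', '\xBF'};

    // Peek at the first three bytes (fewer if the input is shorter).
    char head[3];
    in.read(head, sizeof head);
    const std::streamsize got = in.gcount();

    // Anything that is not exactly a BOM is passed through untouched.
    const bool has_bom = (got == 3 && std::memcmp(head, bom, 3) == 0);
    if (!has_bom) {
        out.write(head, got);
    }

    // Copy the rest of the input in chunks, like 'cat' would.
    char buf[4096];
    while (in.read(buf, sizeof buf) || in.gcount() > 0) {
        out.write(buf, in.gcount());
    }
}
```

Wrapped in a main() that calls strip_bom(std::cin, std::cout) and built as, say, stripBOM, it could stand in for the xxd pipeline: g++ -x c++ <(./stripBOM < test.cpp) -o a.out. As noted above, it still does nothing for #included headers that carry a BOM.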