On Fri, Sep 21, 2012 at 09:21:36AM +0800, Craig Ringer wrote:
> On 09/20/2012 11:44 PM, Leif Biberg Kristensen wrote:
> > On Thursday, 20 September 2012 at 16:56:16, Alan Millington wrote:
> >> psql". But how am I supposed to remove the byte order mark from a UTF8
> >> file? I thought that the whole point of the byte order mark was to tell
> >> programs what the file encoding is. Other programs, such as Python, rely
> >> on this.
> >
> > http://en.wikipedia.org/wiki/Byte_order_mark
> >
> > While the Byte Order Mark is important for UTF-16, it's totally irrelevant to
> > the UTF-8 encoding.
>
> I strongly disagree. The BOM provides a useful and standard way to
> differentiate UTF-8 encoded text files from the random pile of
> encodings that any given file could be.

Use of the BOM in UTF-8 causes a host of display and interoperability
problems, and is considered by many to be a broken practice. It's also
pointless since there are no byte ordering issues with UTF-8. Best to
not use it at all.

In any case, the BOM byte sequence does not unambiguously identify UTF-8;
it's equally valid for 8-bit charsets, so an external means of specifying
the encoding is preferable and more robust.

Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux    http://people.debian.org/~rleigh/
 `. `'   schroot and sbuild  http://alioth.debian.org/projects/buildd-tools
   `-    GPG Public Key      F33D 281D 470A B443 6756 147C 07B3 C8BC 4083 E800
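
For anyone wanting to try the two points above, here is a minimal sketch in Python. It assumes the file content is already in memory as bytes; the helper name strip_utf8_bom and the sample SQL text are hypothetical and only for illustration. It strips a leading UTF-8 BOM (bytes EF BB BF) if one is present, and shows that the same three bytes also decode without error as an 8-bit charset such as ISO-8859-1, which is why the BOM alone cannot prove a file is UTF-8.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if present.

    Hypothetical helper for illustration; not part of psql or any library.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Hypothetical sample: a small SQL script saved with a BOM in front.
sample = codecs.BOM_UTF8 + "SELECT 1;\n".encode("utf-8")

# Stripping removes exactly the three BOM bytes and nothing else.
cleaned = strip_utf8_bom(sample)

# The BOM bytes are also valid ISO-8859-1 text (three printable characters),
# so their presence does not unambiguously identify the encoding.
as_latin1 = codecs.BOM_UTF8.decode("latin-1")

# Python's "utf-8-sig" codec drops a leading BOM while decoding.
decoded = sample.decode("utf-8-sig")
```

Working on bytes rather than decoded text keeps the sketch independent of whatever encoding the rest of the file really uses; only the first three bytes are inspected.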