It also might be a big/little endian problem, although I always thought
that was platform specific, not locale specific. Try the UCS-2-INTERNAL
and UCS-4-INTERNAL codepages in iconv, which should use the two-byte or
four-byte versions of UCS encoding with the system's default endian
setting.

There are many Unicode codepage formats that iconv supports:
UTF-8
ISO-10646-UCS-2 UCS-2 CSUNICODE
UCS-2BE UNICODE-1-1 UNICODEBIG CSUNICODE11
UCS-2LE UNICODELITTLE
ISO-10646-UCS-4 UCS-4 CSUCS4
UCS-4BE
UCS-4LE
UTF-16
UTF-16BE
UTF-16LE
UTF-32
UTF-32BE
UTF-32LE
UNICODE-1-1-UTF-7 UTF-7 CSUNICODE11UTF7
UCS-2-INTERNAL
UCS-2-SWAPPED
UCS-4-INTERNAL
UCS-4-SWAPPED

Gee, didn't Unicode just so simplify this codepage mess? Remember when
it was just ASCII, EBCDIC, ANSI, and localized codepages?

--
Brandon Aiken
CS/IT Systems Engineer

-----Original Message-----
From: pgsql-general-owner@xxxxxxxxxxxxxx [mailto:pgsql-general-owner@xxxxxxxxxxxxxx] On Behalf Of Arnaud Lesauvage
Sent: Wednesday, November 22, 2006 12:38 PM
To: Arnaud Lesauvage; General
Subject: Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

Alvaro Herrera wrote:
> Arnaud Lesauvage wrote:
>> Alvaro Herrera wrote:
>> >Arnaud Lesauvage wrote:
>> >
>> >>mydb=# SET client_encoding TO LATIN9;
>> >>SET
>> >>mydb=# COPY statistiques.detailrecherche (log_gid,
>> >>champrecherche, valeurrecherche) FROM
>> >>'E:\\Production\\Temp\\detailrecherche_ansi.csv' CSV;
>> >>ERROR: invalid byte sequence for encoding "LATIN9": 0x00
>> >>HINT: This error can also happen if the byte sequence does
>> >>not match the encoding expected by the server, which is
>> >>controlled by "client_encoding".
>> >
>> >Huh, why do you have a "0x00" byte in there? That's certainly not
>> >Latin9 (nor UTF8 as far as I know).
>> >
>> >Is the file actually Latin-something or did you convert it to something
>> >else at some point?
>>
>> This is the file generated by DTS with "ANSI" encoding. It
>> was not altered in any way after that !
>> The doc states that ANSI exports with the local codepage
>> (which is Win1252). That's all I know. :(
>
> I thought Win1252 was supposed to be almost the same as Latin1. While
> I'd expect certain differences, I wouldn't expect it to use 0x00 as
> data!
>
> Maybe you could have DTS export Unicode, which would presumably be
> UTF-16, then recode that to something else (possibly UTF-8) with GNU
> iconv.

UTF-16 ! That's something I haven't tried !
I'll try an iconv conversion tomorrow from UTF16 to UTF8 !

--
Arnaud
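
For anyone following the thread later: the 0x00 bytes and the endian question discussed above can be illustrated with a short Python sketch. It is not what either poster ran. The sample row is made up, the helper names `looks_like_utf16` and `recode_utf16_to_utf8` are hypothetical, and the NUL-byte check is only a rough heuristic. It shows that Win1252 text contains no 0x00 bytes while UTF-16 text does, and that a UTF-16 to UTF-8 recode (what iconv would do for this pair of encodings) removes them.

```python
import codecs


def looks_like_utf16(raw: bytes) -> bool:
    """Rough heuristic (an assumption, not a real detector): a UTF-16 BOM
    or any NUL byte suggests the data is not a single-byte codepage."""
    has_bom = raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))
    return has_bom or b"\x00" in raw


def recode_utf16_to_utf8(raw: bytes) -> bytes:
    """Decode UTF-16 (byte order taken from the BOM) and re-encode as UTF-8."""
    return raw.decode("utf-16").encode("utf-8")


# Made-up CSV row standing in for the exported file; not real data.
sample = "1,champ,valeur recherch\u00e9e"

as_cp1252 = sample.encode("cp1252")  # what an "ANSI" export should look like
as_utf16 = sample.encode("utf-16")   # BOM followed by platform byte order

# The same character in the two byte orders: this is the endian difference.
assert "A".encode("utf-16-le") == b"A\x00"
assert "A".encode("utf-16-be") == b"\x00A"

print(looks_like_utf16(as_cp1252))   # False: no BOM, no NUL bytes
print(looks_like_utf16(as_utf16))    # True: starts with a UTF-16 BOM
print(recode_utf16_to_utf8(as_utf16) == sample.encode("utf-8"))  # True
```

To check a real export, one would read the file in binary mode and pass its bytes to the same two helpers. If the file has no BOM, an explicit "utf-16-le" or "utf-16-be" would be needed in place of "utf-16".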