Thomas Mueller wrote:
> Hello,
>
> I did a pg_dumpall of all my Pg 8.0.3 databases, removed 8.0, installed
> 8.1 and tried to import the dump. One table in one database failed with:
>
> ERROR: invalid UTF-8 byte sequence detected near byte 0x83
> CONTEXT: COPY pwd_name, line 22428, column name: "t.tonnement"
>
> So I exported that database with 8.0 as Inserts to a text file and tried
> to fix it using iconv, but that fails as well:
>
> # iconv -f UTF-8 -t UTF-8 dump.sql > dump-fixed.sql
> iconv: illegal input sequence at position 2588882
>
> How can I fix the sql script to import it?
> I have Debian Linux 3.1.

We have updated the 8.1.0 release notes to mention a fix:

    Some users are having problems loading UTF-8 data into 8.1.X.
    This is because previous versions allowed invalid UTF-8 byte
    sequences to be entered into the database, and this release
    properly accepts only valid UTF-8 sequences.  One way to correct a
    dumpfile is to run the command <command>iconv -c -f UTF-8 -t UTF-8
    -o cleanfile.sql dumpfile.sql</>.  The <literal>-c</> option removes
    invalid character sequences.  A diff of the two files will show the
    sequences that are invalid.  <command>iconv</> reads the entire input
    file into memory so it might be necessary to use <application>split</>
    to break up the dump into multiple smaller files for processing.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@xxxxxxxxxxxxxxxx               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
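
If the dump is too large for iconv to hold in memory and splitting it is inconvenient, the same cleanup can be approximated with a short script that streams the file in chunks. The following is only a sketch under assumptions: it uses Python 3 rather than iconv, the function name `strip_invalid_utf8` and the `chunk_size` parameter are made up for illustration, and what Python's decoder treats as valid UTF-8 may not match the server's check in every corner case. Like `iconv -c`, it silently drops the offending bytes, so diff the result against the original before loading it.

```python
import codecs


def strip_invalid_utf8(src_path, dst_path, chunk_size=1 << 20):
    """Copy src_path to dst_path, dropping bytes that are not valid UTF-8.

    Hypothetical helper, not part of PostgreSQL or iconv.
    Returns the number of bytes that were dropped.
    """
    # An incremental decoder buffers a multibyte sequence that happens to be
    # split across two chunks, so chunk boundaries do not cause false errors.
    decoder = codecs.getincrementaldecoder("utf-8")(errors="ignore")
    bytes_in = 0
    bytes_out = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            final = not chunk  # an empty read means end of file
            text = decoder.decode(chunk, final=final)
            data = text.encode("utf-8")
            dst.write(data)
            bytes_in += len(chunk)
            bytes_out += len(data)
            if final:
                break
    return bytes_in - bytes_out
```

Calling `strip_invalid_utf8("dumpfile.sql", "cleanfile.sql")` would then play the role of the iconv command in the release note. A nonzero return value means something was removed, and a diff of the two files shows where.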