Nis Jorgensen wrote:
> Oliver A. Rojo wrote:
>> how do you fix your original db?
>>
>
> Since I had only 3 occurrences of the error, I used
> hand-crafted update statements. The fact that the replacement
> for the invalid characters was constant and plain ascii made
> this very easy.
>
> If you have many occurrences of the error, or if you need to
> do replacement based on the invalid bytes, things become
> trickier. You might even be better off working on the dump
> file directly using perl/<favourite scripting language>

I had the exact same problem with my upgrade - and a lot more than
just a couple of occurrences. The solution is quite easy, however: if
you're prepared to simply eliminate the offending bytes, you'll find
that iconv is a very fast solution.

However, at least on my systems (Debian Sarge), iconv didn't like
my >5GB dump files. So in order to successfully reimport the dumps, I
had to split the SQL file with "split --line-bytes", pass the parts
through "iconv -c -f UTF8 -t UTF8" and concatenate them back into one
file again (a rough sketch of the pipeline is appended below). There
were no more errors on feeding the dump back into psql, and I didn't
come across any missing data during my tests, so this has definitely
done the trick for me.

You should be aware that this will simply omit the illegal byte
sequences from the dump. So if you've got some string
"foo[non-UTF-8-bytes]bar", it will be converted to a simple "foobar"
in the result. If you really need to keep things 100% accurate, you'll
have to actually identify each of these byte sequences, then find the
corresponding UTF-8 representation and use some search & replace
scripting on the dump before reloading it (also sketched below).

Kind regards
Markus
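
For reference, a minimal sketch of that split/iconv/cat round trip,
assuming GNU coreutils and glibc iconv. The file names, the "part_"
prefix and the 500m chunk size are only placeholders (pick whatever
your iconv copes with); --line-bytes keeps whole lines together, so
no multi-byte sequence or SQL statement is cut at a chunk boundary:

    #!/bin/sh
    # split the dump into chunks small enough for iconv,
    # without breaking lines apart
    split --line-bytes=500m huge_dump.sql part_

    # -c silently drops every byte sequence that is not valid UTF-8
    for f in part_??; do
        iconv -c -f UTF8 -t UTF8 "$f" > "$f.clean"
    done

    # glue the cleaned chunks back together in order
    cat part_??.clean > clean_dump.sql

    # and reload, e.g.:
    # psql -f clean_dump.sql yourdb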
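
And for the "100% accurate" route, a rough sketch of how the offending
byte sequences can be located and replaced instead of dropped. This
assumes GNU grep/sed and a UTF-8 locale; "huge_dump.sql", line 4711
and the 0xE9 example byte are made up, your dump will contain other
bytes:

    #!/bin/sh
    # list the lines that are not valid UTF-8: in a UTF-8 locale '.'
    # does not match an invalid byte, so -vx '.*' flags broken lines
    grep -navx '.*' huge_dump.sql > bad_lines.txt

    # inspect one offending line byte by byte, e.g. line 4711
    sed -n '4711p' huge_dump.sql | od -c | less

    # replace the identified byte(s) with their UTF-8 encoding.
    # Example: a stray Latin-1 0xE9 ("é") becomes the pair 0xC3 0xA9.
    # \xHH escapes are a GNU sed extension; LC_ALL=C makes sed work on
    # raw bytes. If the dump also contains valid multi-byte characters,
    # anchor the pattern on surrounding text instead of blindly
    # replacing every 0xE9 in the file.
    LC_ALL=C sed -i 's/\xe9/\xc3\xa9/g' huge_dump.sql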