http://en.wikipedia.org/wiki/Specials_%28Unicode_block%29#Replacement_character
You could replace that sequence with the Unicode replacement character U+FFFD (bytes 0xEF 0xBF 0xBD in UTF-8), for example with `sed`, if you are using a plain-text dump format.
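If `sed` is awkward for raw bytes, the same replacement can be sketched in Python. This is only a rough illustration, not a tested migration tool: the function names and file paths are hypothetical, and it assumes the dump is a plain-text file that is otherwise UTF-8. The bytes 0xED 0xBE 0xBF come from the error message quoted below; they look like an encoded lone surrogate (U+DFBF), which strict UTF-8 rejects. The second helper is a more aggressive fallback that replaces every invalid sequence, not just this one.

```python
# Minimal sketch; helper names and paths are hypothetical.
BAD_SEQUENCE = b"\xed\xbe\xbf"           # bytes reported in the psql error
REPLACEMENT = "\ufffd".encode("utf-8")   # U+FFFD encoded as UTF-8 (EF BF BD)


def replace_bad_sequence(data: bytes) -> bytes:
    """Swap every occurrence of the reported sequence for U+FFFD."""
    return data.replace(BAD_SEQUENCE, REPLACEMENT)


def scrub_invalid_utf8(data: bytes) -> bytes:
    """Replace anything that is not valid UTF-8 with U+FFFD."""
    return data.decode("utf-8", errors="replace").encode("utf-8")


def clean_dump(src_path: str, dst_path: str) -> None:
    """Write a cleaned copy of a plain-text dump, one line at a time.

    Splitting on newlines is safe here because the byte 0x0A never
    appears inside a multi-byte UTF-8 sequence.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in src:
            dst.write(scrub_invalid_utf8(line))
```

Something like `clean_dump("old_dump.sql", "old_dump.clean.sql")` would then produce a file to source with psql. It is worth diffing the two files first, since every replaced sequence is a character lost from the original data.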
We are upgrading an old database (7.3.10 to 8.4.4). This involves running pg_dump on the old db
and loading the dump file into the new db. In case it matters, we do not use pg_restore; the dump file is simply sourced with psql,
and this is where I ran into a problem:
psql: .../postgresql_archive.src/... ERROR: invalid byte sequence for encoding "UTF8": 0xedbebf
HINT: This error can also happen if the byte sequence does not match the encoding
expected by the server, which is controlled by "client_encoding".
The server and client encodings are both Unicode. I think we may have some copied-and-pasted MS Word markup
and possibly other odd things in the old database. All of this junk is found in the ‘text’ fields.
I found a number of related postings, but did not see a good solution. Some folks suggested cleaning the dump file prior to loading,
while someone else did essentially the same thing on the database before dumping it.
I am looking for advice, hopefully the “best technique” if there is one. Any suggestion is appreciated.
Thanks,
Michael.