Invalid byte sequence for encoding "UTF8": 0xedbebf

BRUSSER Michael <Michael.BRUSSER@xxxxxxx> · Wed, 15 Jun 2011 20:18:27 +0000

This is a follow-up on my previous message 
http://archives.postgresql.org/pgsql-general/2011-06/msg00054.php

I think I now have some understanding of what’s causing the problem, but I don’t have a good solution, only more questions.
The release notes for v8.1 at 
http://www.postgresql.org/docs/current/interactive/release-8-1.html
suggest using iconv to convert the plain-text dump file to UTF-8.
On Linux this did not work: the input and output files were identical. The iconv on Solaris refused to open the input file
(probably because it is too big), although it worked on a chunk of the file and reported a conversion error.
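
As an alternative to iconv for locating the bad data, here is a minimal sketch, assuming Python 3 is available on the machine that holds the plain-text dump. It reads the file one line at a time, so file size should not matter, and it reports each line that fails strict UTF-8 decoding. The function name find_invalid_utf8 and the output format are made up for this illustration. The bytes 0xED 0xBE 0xBF decode to a surrogate code point (U+DFBF), which strict UTF-8 decoders reject, so such lines should be flagged. Only the first bad sequence on each line is reported.

```python
import sys


def find_invalid_utf8(lines):
    """Yield (line_number, byte_offset, bad_bytes) for each line that is not valid UTF-8.

    `lines` is any iterable of bytes objects, e.g. a file opened in binary mode.
    Only the first invalid sequence on each line is reported.
    """
    for lineno, raw in enumerate(lines, start=1):
        try:
            raw.decode("utf-8")  # strict decoding; rejects surrogates such as ED BE BF
        except UnicodeDecodeError as err:
            # Show up to three bytes starting at the position the decoder rejected.
            yield lineno, err.start, raw[err.start:err.start + 3]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python find_bad_utf8.py dump.sql
    with open(sys.argv[1], "rb") as dump:
        for lineno, offset, bad in find_invalid_utf8(dump):
            print(f"line {lineno}, offset {offset}: 0x{bad.hex()}")
```

In a plain-text dump each row of a COPY block sits on its own line, so the reported line numbers can be looked up in the dump to find the table and key of the affected records.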

Unless there are no other options, I don’t want to use sed or break the file into pieces. If possible, I would prefer to identify the bad records in the database.

I tried SELECT with everything I could think of: ~*, SIMILAR TO, and the like, but I never got it right.

Is there a way to find the records whose text field contains the byte sequence “0xedbebf”?
Unfortunately, this is a very old version, 7.3.10.

Thank you.
Michael.
