On Fri, Feb 27, 2015 at 9:03 AM, Adrian Klaver <adrian.klaver@xxxxxxxxxxx> wrote: > On 02/27/2015 06:39 AM, Melvin Call wrote: >> >> On 2/26/15, Vick Khera <vivek@xxxxxxxxx> wrote: >>> >>> On Thu, Feb 26, 2015 at 9:50 PM, Melvin Call <melvincall979@xxxxxxxxx> >>> wrote: >>> >>>> Montreal where the e is an accented e. The output ends up in the text >>>> file >>>> as >>>> Montr\xe9al, where the xe9 is a single character. When I try to copy >>>> that >>>> into >>>> my PostgreSQL table, I get an error "ERROR: invalid byte sequence for >>>> encoding >>>> >>> >>> Character code E9 is not UTF8. Don't tell Postgres you're importing UTF8 >>> if >>> you're not. >>> >> Thank you Vic, adding the ENCODING 'latin1' option to the COPY command >> worked >> perfectly. >> >> If you don't mind a follow up to your reply, I have tried to understand >> the >> different character sets and collations, but I guess I still have a lot to >> learn. Your suggestion did not even come close to crossing my mind because >> the >> MySQL table and database are encoded in UTF8. I assume the conversion to >> latin1 >> happened because I was putting the MySQL query output into a locally >> stored >> text file? Regardless, can you point me to some reading that would have >> clued >> me in that e9 is not a UTF8 character? Or is the clue the fact that it was >> not >> preceeded with 0x00? > > > For UTF8 characters see here: > > http://www.utf8-chartable.de/ Thank you for the link. Bookmarked. > > > For the MySQL part, you are going to detail how you got the data out? This is in preparation of moving away from MySQL. I inherited this MySQL database and it is in horrible shape, no referential integrity, no constraints other than arbitrarily chosen VARCHAR lengths, no indexes, and inconsistent entity and attribute naming. I have to pull the fields out through a query that is being redirected to a local file, as opposed to a dump, because I'm having to filter out a lot of useless rows (TestCity in a production system!?). I just realized I could have put the usable data into similar tables and then used mysqldump with the encoding specified, but no need now. I have my extraction and import done, so this should be the last I need to touch the MySQL system. > >> >> Regards, >> Melvin >> >> > > > -- > Adrian Klaver > adrian.klaver@xxxxxxxxxxx -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general