On Fri, Sep 2, 2011 at 9:30 PM, Rural Hunter <ruralhunter@xxxxxxxxx> wrote:
> Hi Kevin,
>
> I did another try with following additional changes based on our discussion:
> 1. use the tcp connection
> 2. turn off autovacuum
> 3. turn off full_page_writes
>
> I could import more than 30G data in about 2 hours. That's totally
> acceptable performance to me with the current server capability. There is a
> minor issue though. I saw a few errors during the import:
> ERROR: invalid byte sequence for encoding "UTF8": 0xe6272c
> ERROR: invalid byte sequence for encoding "UTF8": 0xe5272c
> ERROR: invalid byte sequence for encoding "UTF8": 0xe5272c
> ERROR: invalid byte sequence for encoding "UTF8": 0xe5272c
> ERROR: invalid byte sequence for encoding "UTF8": 0xe68e27
> ERROR: invalid byte sequence for encoding "UTF8": 0xe7272c
> ERROR: invalid byte sequence for encoding "UTF8": 0xe5272c
> ERROR: invalid byte sequence for encoding "UTF8": 0xe5a427
>
> My data was exported from an UTF8 MySQL database and my pgsql db is also
> UTF8. I got 8 errors above only with about 3 million records imported. The
> strange thing is, I usually see the problematic SQL output in the log if
> there is any error for that SQL so I have a chance to fix the data manually.
> But for the errors above, I don't see any SQL logged. The pgsql log just
> output error log same as above with no additional info:
> 2011-09-01 11:26:32 CST ERROR: invalid byte sequence for encoding "UTF8": 0xe6272c
> 2011-09-01 11:26:47 CST ERROR: invalid byte sequence for encoding "UTF8": 0xe5272c
> 2011-09-01 11:26:53 CST ERROR: invalid byte sequence for encoding "UTF8": 0xe5272c
> 2011-09-01 11:26:58 CST ERROR: invalid byte sequence for encoding "UTF8": 0xe5272c
> 2011-09-01 11:26:58 CST ERROR: invalid byte sequence for encoding "UTF8": 0xe68e27
> 2011-09-01 11:27:01 CST ERROR: invalid byte sequence for encoding "UTF8": 0xe7272c
> 2011-09-01 11:27:06 CST ERROR: invalid byte sequence for encoding "UTF8": 0xe5272c
> 2011-09-01 11:27:15 CST ERROR: invalid byte sequence for encoding "UTF8": 0xe5a427
>
> What could be the cause of that?

MySQL probably has looser checking of proper UTF-8 encodings.

--
Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
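
As an illustration of the encoding point: every reported sequence starts with a byte in the 0xE0-0xEF range, which in UTF-8 announces a three-byte character, yet the bytes that follow include 0x27 (') and 0x2c (,), which are plain ASCII and not continuation bytes (0x80-0xBF). That pattern would be consistent with a multibyte character cut short just before the closing quote of a string value, though this is only a guess from the hex codes. Since the server log does not show the offending SQL, one way to locate the rows is to scan the dump file itself. Below is a minimal sketch in Python, assuming the dump is a plain SQL text file; the function name find_invalid_utf8 and the file name dump.sql are hypothetical, and the three-byte hex it prints is chosen only to make it easy to compare with the codes in the log.

```python
from typing import Iterable, Iterator, Tuple


def find_invalid_utf8(lines: Iterable[bytes]) -> Iterator[Tuple[int, int, str]]:
    """Yield (line_number, byte_offset, hex_of_3_bytes) for each invalid UTF-8 sequence.

    `lines` is any iterable of byte strings, e.g. a file opened in binary mode.
    """
    for line_no, line in enumerate(lines, start=1):
        pos = 0
        while pos < len(line):
            try:
                # Strict decoding raises on the first invalid sequence.
                line[pos:].decode("utf-8")
                break  # the rest of this line is valid UTF-8
            except UnicodeDecodeError as err:
                start = pos + err.start
                # Show three bytes from the failure point so the value can be
                # compared with the hex code in the server log.
                yield line_no, start, line[start:start + 3].hex()
                # Resume scanning after the bytes the decoder rejected.
                pos = pos + err.end


# Possible usage (dump.sql is a hypothetical file name):
# with open("dump.sql", "rb") as f:
#     for line_no, offset, hex_bytes in find_invalid_utf8(f):
#         print(f"line {line_no}, byte {offset}: 0x{hex_bytes}")
```

Reading the file in binary mode matters here, since opening it as text would fail or silently replace the very bytes being searched for. Once the line numbers are known, the affected rows can be fixed by hand in the dump before re-importing.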