On 7/6/06, Tino Wildenhain <tino@xxxxxxxxxxxxx> wrote:
...
>> Yes, it's actually quite easy: you dump as you feel appropriate,
>> then create the database with the encoding you want,
>> restore w/o creating database and you are done.
>> Restore sets the client encoding to what it actually was
>> in the dump data (in your case latin-1) and the database
>> would be utf-8 - postgres automatically recodes. No need
>> for iconv and friends.
>>
>> Regards
>> Tino
>>
>
> First of all, thank you for your answer. However, I suspect I did not
> understand your answer, since the commands I used were:
>
> 1) pg_dump -Ft -b -f dump.sql.tar database
> 2) dropdb database
> 3) createdb -E UNICODE database
> 4) pg_restore -d database dump.sql.tar
>
> According to my experience, this produces a "double encoding". As you
> can see, I hand-created the database, with the proper encoding.
> However, when I reimported the database, the result was a latin1
> encoded in utf-8, rather than a pure utf-8.
>
> How was my procedure different from yours?

That was the correct way. I wonder if you have recoding support enabled? Did you build postgres yourself?
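For readers unsure what "double encoding" looks like at the byte level, here is a minimal Python sketch. It assumes one possible cause, namely that bytes which were already UTF-8 get treated as Latin-1 and recoded to UTF-8 a second time; whether that is what happened in this restore is not established in the thread. The function names are made up for illustration.

```python
def double_encode(text):
    """Simulate the fault: UTF-8 bytes misread as Latin-1, then encoded as UTF-8 again."""
    utf8_bytes = text.encode("utf-8")        # correct UTF-8 bytes
    misread = utf8_bytes.decode("latin-1")   # each byte wrongly taken as one Latin-1 character
    return misread.encode("utf-8")           # second, unwanted UTF-8 encoding


def repair_double_encoding(data):
    """Undo one round of double encoding.

    May raise UnicodeError if the input was not double encoded.
    """
    return data.decode("utf-8").encode("latin-1").decode("utf-8")


stored = double_encode("è")
print(stored)                          # four bytes instead of the expected two
print(repair_double_encoding(stored))  # the original character again
```

A value that is two bytes in clean UTF-8 comes out as four bytes, which is the usual symptom of this fault.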
Support for recoding? I don't know... I compiled postgres myself, and it is, BTW, 7.4.8. How can I check if auto recoding is enabled?
Latin-1 double encoded into utf-8 does not seem possible... utf-8 barfs on most latin-1 characters, and the current 8.1 is very picky about it. So maybe you can work with a small test table to find out what's going wrong here.
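The point about utf-8 rejecting latin-1 bytes can be checked outside the database. This is a small Python sketch using a strict UTF-8 decoder; the helper name `is_valid_utf8` is hypothetical, and it only shows the byte-level rule, not how any particular postgres version validates input.

```python
def is_valid_utf8(data):
    """Return True if the byte string decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


latin1_bytes = "è".encode("latin-1")  # a single byte, 0xE8
utf8_bytes = "è".encode("utf-8")      # two bytes, 0xC3 0xA8

print(is_valid_utf8(latin1_bytes))
print(is_valid_utf8(utf8_bytes))
```

A lone Latin-1 accented byte is not a valid UTF-8 sequence, so running a check like this over a few values exported from a small test table can help tell which encoding the stored bytes really are.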
Yes, I will do that... I understand postgresql became much more "picky" about encoding in later releases.
(Changing the client_encoding setting in the backup is only needed when you had data in the wrong encoding, like SQL_ASCII filled with latin-1 or something.)
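For that special case, the change could be sketched in Python as below. This assumes a plain-text SQL dump (not the tar format used earlier in the thread) that contains a `SET client_encoding = '...';` statement; the function name `declare_dump_encoding` is hypothetical.

```python
import re

# Matches statements such as: SET client_encoding = 'SQL_ASCII';
_CLIENT_ENCODING = re.compile(
    r"^(SET\s+client_encoding\s*(?:=|TO)\s*)'[^']*'(\s*;)",
    re.IGNORECASE | re.MULTILINE,
)


def declare_dump_encoding(dump_text, actual_encoding):
    """Rewrite the SET client_encoding statement to name the encoding the data is really in."""
    new_text, count = _CLIENT_ENCODING.subn(
        lambda m: f"{m.group(1)}'{actual_encoding}'{m.group(2)}",
        dump_text,
    )
    if count == 0:
        raise ValueError("no SET client_encoding statement found in dump")
    return new_text


sample = "SET client_encoding = 'SQL_ASCII';\nCOPY t (x) FROM stdin;\n"
print(declare_dump_encoding(sample, "LATIN1"))
```

When applying this to a real file, reading and writing it with `encoding="latin-1"` keeps every byte unchanged, since only the declaration should be edited and not the data itself.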
Ok, thanks!

Regards
Marco

--
Marco Bizzarri
http://notenotturne.blogspot.com/