On 7/6/06, Tino Wildenhain <tino@xxxxxxxxxxxxx> wrote:
Marco Bizzarri schrieb: > Hi all. > > Here is my use case: I've an application which uses PostgreSQL as > backend. Up to now, the database was encoded in SQL_ASCII or LATIN1. > Now, we need to migrate to UTF-8. > > What we tried, was to: > > 1) dump the database using pg_dump, in tar format (we had blob); > 2) modifying the result, using some conversion tool (like recode) > > > 3) destroying the old database > 4) recreating the database with UNICODE setting > 5) restoring the database using pg_restore > > The result was not what I expected. The pg_restore was using the > LATIN1 encoding to encode the strings, resulting in a LATIN1 encoded > in UTF-8... > > The problem lied in the toc.dat file, which stated that the client > encoding was LATIN1, instead of UTF-8. > > The solution in the end has been to manually modifying the toc.dat > file, substituting the LATIN1 string with UTF-8 (plus a space, since > the toc.dat is a binary file). > > Even though it worked for us, I wonder if there is any other way to > accomplish the same result, at least to specify the encoding for the > restore. Yes, its actually quite esay: you dump as you feel apropriate, then create the database with the encoding you want, restore w/o creating database and you are done. Restore sets the client encoding to what it actually was in the dump data (in your case latin-1) and the database would be utf-8 - postgres automatically recodes. No need for iconv and friends. Regards Tino
First of all, thank you for your answer. However, I suspect I did not understand your answer, since the commands I used were: 1) pg_dump -Ft -b -f dump.sql.tar database 2) dropdb database 3) createdb -E UNICODE database 4) pg_restore -d database dump.sql.tar According to my experience, this produces a "double encoding". As you can see, I hand-created the database, with the proper encoding. However, when I reimported the database, the result was a latin1 encoded in utf-8, rather than a pure utf-8. How my procedure was different with respect to yours? I will make some test with a sample database, and enabling the logging, so that I can understand the commands which are issued. Regards Marco -- Marco Bizzarri http://notenotturne.blogspot.com/