Re: converting databases from SQL_ASCII to UTF8

Geoffrey Myers <lists@xxxxxxxxxxxxxxxxxxxxx> · Tue, 03 May 2011 14:20:21 -0400

Jasen Betts wrote:
On 2011-04-22, Geoffrey Myers <geof@xxxxxxxxxxxxxxxxxxxxx> wrote:
Vick Khera wrote:
On Fri, Apr 22, 2011 at 11:00 AM, Geoffrey Myers <lists@xxxxxxxxxxxxxxxxxxxxx> wrote:

    Here's our problem.  We planned on moving databases a few at a time.
    Problem is, there is a process that pushes data from one database to
    another.  If this process attempts to push data from a SQL_ASCII
    database to a new UTF8 database and it has one of these characters
    mentioned above, the process fails.
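
The failure described above can be reproduced outside the database. What follows is only a minimal sketch, assuming the offending values are bytes in a single-byte encoding such as Latin-1 (the thread does not say which encoding they actually use). The helper name find_invalid_utf8 and the sample rows are hypothetical; the idea is to list the values a UTF8 database would reject before the push process ever sees them.

```python
def find_invalid_utf8(values):
    """Return (index, raw_bytes, error_offset) for each value that is not valid UTF-8."""
    problems = []
    for index, raw in enumerate(values):
        try:
            raw.decode("utf-8", errors="strict")
        except UnicodeDecodeError as exc:
            # exc.start is the offset of the first byte that could not be decoded.
            problems.append((index, raw, exc.start))
    return problems


# Hypothetical column values as they might sit in a SQL_ASCII database.
rows = [
    b"plain ascii",
    "caf\u00e9".encode("utf-8"),    # valid UTF-8: e-acute is the pair 0xC3 0xA9
    "caf\u00e9".encode("latin-1"),  # assumed legacy data: e-acute is the lone byte 0xE9
]

for index, raw, offset in find_invalid_utf8(rows):
    print(f"row {index}: invalid byte at offset {offset}: {raw!r}")
```

Only the third value is reported. A SQL_ASCII database stores all three without complaint, which is why the problem surfaces only when the data reaches a UTF8 database.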

The database's enforcement of the encoding should be the last layer that 
does so.  Your applications should be enforcing strict utf-8 encoding 
from start to finish.  Once this is done, and the old data already in 
the DB is properly encoded as utf-8, then there should be no problems 
switching on the utf-8 encoding in postgres to get that final layer of 
verification.

Totally agree.  Still, the question remains: why not leave it as SQL_ASCII?

perhaps you want sorted output in some locale other than 'C'?
or maybe want to take a substring in the database...

utf8 in SQL_ASCII is just a string of octets

utf8 in a utf8 database is a string of unicode characters.
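
The octets-versus-characters distinction can be illustrated outside the database. This is a rough Python sketch rather than PostgreSQL itself: bytes objects stand in for how a SQL_ASCII database sees the data, str objects stand in for a UTF8 database, and the helper names and sample strings are made up for the example.

```python
text = "na\u00efve caf\u00e9"   # 10 characters
octets = text.encode("utf-8")    # 12 octets: each accented letter takes two


def first_chars(s, n):
    """Character-based prefix: counts whole characters."""
    return s[:n]


def first_octets(b, n):
    """Octet-based prefix: counts bytes and may cut a character in half."""
    return b[:n]


print(len(text), len(octets))
print(first_chars(text, 3))

prefix = first_octets(octets, 3)
try:
    prefix.decode("utf-8")
except UnicodeDecodeError:
    print("octet prefix splits a character:", prefix)

# Sorting by raw octets, roughly what the 'C' locale does, puts every
# non-ASCII letter after 'z' instead of where a language-aware collation would.
words = ["zebra", "\u00e9clair", "apple"]
byte_order = sorted(words, key=lambda w: w.encode("utf-8"))
print(byte_order)
```

The three-octet prefix ends in the middle of the two-byte sequence for the accented letter, so it is no longer valid UTF-8, while the three-character prefix is intact.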

We finally have a solution in place. A bug in my code was making the 
problem bigger than it really is.  Gotta love those bugs.

--
Until later, Geoffrey

"I predict future happiness for America if they can prevent
the government from wasting the labors of the people under
the pretense of taking care of them."
- Thomas Jefferson
