Re: converting databases from SQL_ASCII to UTF8

Jasen Betts <jasen@xxxxxxxxxx> · 3 May 2011 13:05:14 GMT

On 2011-04-22, Geoffrey Myers <geof@xxxxxxxxxxxxxxxxxxxxx> wrote:
> Vick Khera wrote:
>> On Fri, Apr 22, 2011 at 11:00 AM, Geoffrey Myers 
>> <lists@xxxxxxxxxxxxxxxxxxxxx> wrote:
>> 
>>     Here's our problem.  We planned on moving databases a few at a time.
>>     Problem is, there is a process that pushes data from one database to
>>     another.  If this process attempts to push data from a SQL_ASCII
>>     database to a new UTF8 database and it has one of these characters
>>     mentioned above, the process fails.
>> 
>> 
>> The database's enforcement of the encoding should be the last layer that 
>> does so.  Your applications should be enforcing strict utf-8 encoding 
>> from start to finish.  Once this is done, and the old data already in 
>> the DB is properly encoded as utf-8, then there should be no problems 
>> switching on the utf-8 encoding in postgres to get that final layer of 
>> verification.
>
> Totally agree.  Still, the question remains, why not leave it as SQL_ASCII?

Perhaps you want sorted output in some locale other than 'C'?
Or maybe you want to take a substring in the database...

UTF-8 in a SQL_ASCII database is just a string of octets.

UTF-8 in a UTF8 database is a string of Unicode characters.
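
To make the octets-versus-characters point concrete, here is a rough
sketch in Python rather than SQL. It is only an analogy: the bytes object
stands in for what a SQL_ASCII database holds, the str object for what a
UTF8 database holds. The sample text and the helper is_valid_utf8 are
made up for illustration.

```python
# Hypothetical sample value: "naive cafe" with i-diaeresis and e-acute,
# written with escapes so the code points are unambiguous.
text = "na\u00efve caf\u00e9"

# Roughly what a SQL_ASCII database holds: the raw UTF-8 octets.
octets = text.encode("utf-8")

char_count = len(text)      # counts characters
octet_count = len(octets)   # counts bytes; the two accented letters take two each

# "The first 3" means different things for the two representations.
char_prefix = text[:3]      # three whole characters
octet_prefix = octets[:3]   # three bytes, cut in the middle of a character


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(char_count, octet_count)
print(repr(char_prefix), octet_prefix, is_valid_utf8(octet_prefix))
```

The byte-wise prefix ends halfway through a multi-byte character, so it
is no longer valid UTF-8. That is the kind of value a UTF8 database would
reject and a SQL_ASCII database would store without complaint.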

-- 
⚂⚃ 100% natural

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)