
Latin1 to UTF-8 ?


Hi,

I've set up a new CentOS server with PostgreSQL 8.2.4 and initdb'ed it with 
UTF-8.

OK, and it runs fine.

I have a problem with encodings, however, mainly with the Russian Cyrillic 
characters.

When I test-dumped some dbs from the old FC / Pg 8.0.2 server, all Latin1, I 
noticed that some of the dumps show in the Konqueror file browser as 'Plain 
Text Documents' and some as 'C++ Source Files'. Both have Latin1 as client 
encoding at the top of the files. Changing that gives errors, as expected.

Looking into the plain text dumps, I see all Cyrillic characters as Р... 
and these display fine from the new server's UTF-8 environment.

Some of the 'C++' files have the Cyrillic characters as 'îñåòèòåëåé'. Some 
have both 'îñåòèòåëåé' and Р... and of course the 'îñåò' characters come out 
wrong and unreadable in the browser. (Not sure if you can see the single-quoted 
ones, but they look something like Hebrew or similar.)
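
One way to test a guess about the 'îñåò' strings is to take them back to raw bytes and decode those bytes under a different encoding. The sketch below assumes the bytes were really written as Windows-1251 (cp1251) and are only being displayed as Latin1. That is an assumption, not something the dumps confirm, and the helper name reinterpret is made up for illustration.

```python
def reinterpret(text: str, shown_as: str = "latin-1", actual: str = "cp1251") -> str:
    """Recover the raw bytes behind mis-displayed text, then decode them again.

    shown_as: encoding the text was (wrongly) displayed in.
    actual:   encoding the bytes are ASSUMED to really be in (a guess).
    """
    raw = text.encode(shown_as)   # back to the original byte values
    return raw.decode(actual)     # decode those bytes under the guessed encoding


if __name__ == "__main__":
    print(reinterpret("îñåòèòåëåé"))
```

For the sample string above this yields readable Cyrillic letters (осетителей), which suggests the guess is at least plausible for that value. Other rows may well have been entered in a different encoding.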

I have no idea what browsers / encodings or even keyboard layouts have been 
used when the data has been inserted by users through their web 
interfaces ...

I tried the -F p switch as the earlier version has no -E for dumps. Same 
output. Also with pg_dumpall.

I tried various encodings with iconv too.

So, what would be the proper way to convert the dumps to UTF-8? Or any other 
solution? Any other tool to work with the problem files?
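
For a dump that turns out to use one legacy encoding throughout, the conversion could be sketched roughly as below. It does about the same job as iconv, plus rewriting the client encoding line at the top of the file. The cp1251 default, the function name recode_dump and the exact form of the SET client_encoding line are all assumptions. A dump that mixes encodings between rows would need to be handled row by row instead.

```python
import re

# Matches a line such as: SET client_encoding = 'LATIN1';
# (the exact form of this line in the dump is an assumption)
_CLIENT_ENCODING = re.compile(
    r"^(\s*SET\s+client_encoding\s*(?:=|TO)\s*')[^']*(')",
    re.IGNORECASE | re.MULTILINE,
)


def recode_dump(data: bytes, source_encoding: str = "cp1251") -> bytes:
    """Re-encode a plain-text dump to UTF-8 and update its client_encoding line.

    source_encoding is a guess about what the bytes really are; decoding is
    strict, so a byte that is undefined in that encoding raises UnicodeDecodeError.
    """
    text = data.decode(source_encoding)
    # Only the first declaration is rewritten, so that the header matches the new bytes.
    text = _CLIENT_ENCODING.sub(r"\g<1>UTF8\g<2>", text, count=1)
    return text.encode("utf-8")


if __name__ == "__main__":
    sample = b"SET client_encoding = 'LATIN1';\nINSERT INTO t VALUES ('\xee\xf1\xe5\xf2');\n"
    print(recode_dump(sample).decode("utf-8"))
```

Reading the file in binary mode and passing the bytes through this function keeps the guess about the source encoding in one place, so it is easy to try another candidate when the result is still unreadable.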

BR,

Aarni
-- 
Aarni Ruuhimäki

