Re: Dumping/Restoring with constraints?


Phoenix Kiula wrote:
Thanks Andrew.

On the server (the DB to be dumped) everything is "UTF8".

On my home server (where I would like to mirror the DB), this is the output:


=# \l
            List of databases
   Name    |      Owner      | Encoding
-----------+-----------------+-----------
 postgres  | postgres        | SQL_ASCII
 pkiula    | pkiula_pkiula   | UTF8
 template0 | postgres        | SQL_ASCII
 template1 | postgres        | SQL_ASCII
(4 rows)



This is a fresh install as you can see. The database into which I am
importing ("pkiula") is in fact listed as UTF8! Is this not enough?


You said you're getting these errors:
ERROR:  invalid byte sequence for encoding "UTF8": 0x80

Those 0x80 bytes are inside the mydb.sql file, so you may find it easier to look for them there and identify the offending string(s). Try this on the Linux machine:

zcat mydb.sql.gz | iconv -f utf8 > /dev/null

It should tell you something like:

illegal input sequence at position xxx
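
Since iconv stops at the first bad sequence, it can help to list every offending line instead. The following is only a rough sketch: `find_invalid_utf8` is a made-up helper name, and it assumes the dump is line-oriented text with no NUL bytes. It is slow on large dumps because it runs iconv once per line.

```shell
# Print "LINENO:LINE" for every input line that is not valid UTF-8.
# find_invalid_utf8 is a hypothetical helper, not part of any tool.
find_invalid_utf8() {
    n=0
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        n=$((n + 1))
        # iconv exits non-zero when the line is not valid UTF-8.
        if ! printf '%s\n' "$line" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
            printf '%d:%s\n' "$n" "$line"
        fi
    done
}

# Possible usage, with the file name from this thread:
#   zcat mydb.sql.gz | find_invalid_utf8
```

The reported lines still contain the raw bytes, so pipe them through hexdump -C if your terminal does not show them clearly.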

BTW, 0x80 is usually found in Windows encodings, such as windows-1250, where it stands for the euro sign:

echo -n "€" | iconv -t windows-1250 | hexdump -C
00000000  80                                                |.|
00000001


FYI, you *can* get non-UTF-8 data from a UTF-8 database if (and only if) your client encoding is something different, either because you set it explicitly or because of your client defaults.

Likewise, you can insert non-UTF-8 data (such as your mydb.sql) into a UTF-8 database, provided you set your client encoding accordingly. PostgreSQL converts between the client encoding and the database encoding, but there's no way to reliably guess the encoding of a text file.
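
Instead of changing the client encoding, the file itself can be converted before loading. This is a minimal sketch that assumes the whole dump is uniformly windows-1250, which is only a guess based on the 0x80 byte. `to_utf8_from_win1250` and the output file name are made up for illustration.

```shell
# Convert a stream from windows-1250 to UTF-8.
# to_utf8_from_win1250 is a hypothetical helper name.
to_utf8_from_win1250() {
    iconv -f WINDOWS-1250 -t UTF-8
}

# Possible usage (mydb.utf8.sql is a placeholder name):
#   zcat mydb.sql.gz | to_utf8_from_win1250 > mydb.utf8.sql
```

If part of the file is already UTF-8, this will double-encode those parts, so check the result before restoring it.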

OTOH, from a SQL_ASCII database you can get all sorts of data, even text in mixed encodings (which you need to fix somehow). If your mydb.sql contains data from a SQL_ASCII database, you simply know nothing about its encoding.

I have seen SQL_ASCII databases containing data inserted from HTTP forms in both UTF-8 and windows-1250 encoding. Displaying, dumping, or restoring that correctly is impossible; you need to fix it somehow before processing it as text.
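
One way to attempt such a fix is a line-by-line heuristic: keep lines that are already valid UTF-8 and treat everything else as windows-1250. This is only a sketch under that assumption. `fix_mixed_encoding` is a made-up name, and the heuristic cannot help if a single line mixes both encodings or if a third encoding is involved.

```shell
# Emit UTF-8: pass valid UTF-8 lines through unchanged and convert
# all other lines from windows-1250 (an assumption, not a certainty).
# fix_mixed_encoding is a hypothetical helper name.
fix_mixed_encoding() {
    while IFS= read -r line || [ -n "$line" ]; do
        if printf '%s\n' "$line" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
            # Already valid UTF-8: keep the line as it is.
            printf '%s\n' "$line"
        else
            # Not valid UTF-8: assume windows-1250 and convert.
            printf '%s\n' "$line" | iconv -f WINDOWS-1250 -t UTF-8
        fi
    done
}

# Possible usage (mydb.fixed.sql is a placeholder name):
#   zcat mydb.sql.gz | fix_mixed_encoding > mydb.fixed.sql
```

Review the converted lines by hand before restoring, since a wrong guess silently produces the wrong characters.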

.TM.

