
Re: MSSQL to PostgreSQL : Encoding problem


 



Tony Caduto wrote:
Arnaud Lesauvage wrote:


I then try to import into PostgreSQL. The farthest I can get is with the UNICODE export, importing it with client_encoding set to UTF8 (I also tried WIN1252, LATIN9, LATIN1, ...).
The copy then stops with an error:
ERROR: invalid byte sequence for encoding "UTF8": 0xff
SQL state: 22021

The problematic character is the euro currency symbol.


Exporting from MS SQL Server as Unicode is going to give you full Unicode, not UTF8. Full Unicode is 2 bytes per character and UTF8 is 1, same as ASCII.
You will have to encode the Unicode data to UTF8.
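
A quick way to check this diagnosis is to look at the first bytes of the export. The sketch below assumes (it is not confirmed in this thread) that the "Unicode" export is UTF-16 little-endian with a byte order mark, which would begin with the bytes 0xFF 0xFE. The byte 0xFF can never appear in valid UTF-8, which would match the error above. The helper names are made up for illustration.

```python
import codecs


def looks_like_utf16(first_bytes: bytes) -> bool:
    """Return True if the data starts with a UTF-16 byte order mark."""
    return first_bytes.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Assumption: this mimics a UTF-16 LE export with a BOM.
sample = codecs.BOM_UTF16_LE + "prix: 10 €".encode("utf-16-le")

print(looks_like_utf16(sample))   # the BOM is present
print(is_valid_utf8(sample))      # 0xFF is rejected by a UTF-8 decoder
print(hex(sample[0]))             # first byte of the little-endian BOM
```

If the real file starts with a UTF-16 BOM, no client_encoding setting on the PostgreSQL side is likely to help, since PostgreSQL has no UTF-16 client encoding; the file has to be converted first.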

Well, UTF8 is a minimum of one byte, but can be longer for non-ASCII characters. The idea being that chars below 128 map to ASCII. There's also UTF16 and, I believe, UTF32, with two-or-more-byte and four-byte characters respectively.
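
The sizes are easy to verify. This small sketch encodes a few characters, including the euro sign from the original report, and counts the bytes in each encoding. The function name `encoded_sizes` is just an illustrative helper.

```python
def encoded_sizes(ch: str) -> dict:
    """Return the number of bytes needed for one character in each encoding."""
    # The -le variants are used so that no byte order mark is counted.
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}


for ch in ("A", "é", "€", "\U0001F600"):
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

The ASCII letter takes one byte in UTF-8, the euro sign takes three, and the character outside the Basic Multilingual Plane takes four bytes in all three encodings (a surrogate pair in UTF-16).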

I have done this in Delphi using its built-in UTF8 encoding and decoding routines. You can get a free copy of Delphi Turbo Explorer, which includes components for MS SQL Server and ODBC, so it would be pretty straightforward to get this working.

The actual method in Delphi is system.UTF8Encode(widestring). This will encode Unicode to UTF8, which is compatible with a PostgreSQL UTF8 database.
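
For anyone without Delphi at hand, the same re-encoding step can be sketched in Python. This is not the Delphi routine mentioned above, just an equivalent idea, and it assumes the export is a UTF-16 text file that starts with a byte order mark. The function name and the file names in the usage lines are hypothetical.

```python
def utf16_to_utf8(src_path: str, dst_path: str) -> None:
    """Re-encode a UTF-16 text file (with BOM) as UTF-8, line by line."""
    # encoding="utf-16" reads the BOM to pick the byte order and strips it.
    # newline="" leaves line endings exactly as they are in the source file.
    with open(src_path, "r", encoding="utf-16", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file names): python convert.py export.txt export_utf8.txt
    if len(sys.argv) == 3:
        utf16_to_utf8(sys.argv[1], sys.argv[2])
```

The converted file should then load with client_encoding set to UTF8. If the export has no byte order mark, Python refuses to guess the byte order when reading, and an explicit encoding such as "utf-16-le" would be needed instead.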

Ah, that's useful to know. Windows just doesn't have the same quantity of tools installed as a *nix platform.

I am sure Perl could do it also.

And in one line if you're clever enough no doubt ;-)

--
  Richard Huxton
  Archonet Ltd


