Re: inserts bypass encoding conversion

"James Pang (chaolpan)" <chaolpan@xxxxxxxxx> writes:
> So, insert into values(chr(226)||chr(128)||chr(166)) actually got stored in the database in LATIN1 as a single-byte sequence, but when I query select * from testutf8, it gets converted to a UTF8 three-byte sequence first?

There are no LATIN1 characters that have longer than 2-byte UTF8
representations, so no.
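
This is easy to check outside the database. A minimal sketch in Python, assuming its "latin-1" codec is a fair stand-in for the server's LATIN1 (it maps all 256 byte values to U+0000..U+00FF); the variable names are only for illustration:

```python
# Decode each possible LATIN1 byte to its character, re-encode it as
# UTF-8, and record how many bytes the UTF-8 form needs.
utf8_lengths = {
    byte: len(bytes([byte]).decode("latin-1").encode("utf-8"))
    for byte in range(256)
}

longest = max(utf8_lengths.values())
print(longest)  # 2
```

Bytes below 0x80 stay one byte in UTF-8 and everything from 0x80 up becomes two, so no single LATIN1 character can turn into a three-byte sequence.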

I think your fundamental misunderstanding is supposing that this:

	chr(226)||chr(128)||chr(166)

produces something equivalent to the UTF8 sequence 0xe2 0x80 0xa6.
It will not, no matter which server encoding you are dealing with.
It will produce something that is three separate characters
according to the server encoding.  In LATIN1, that could well be
the byte sequence 0xe2 0x80 0xa6, but *that byte sequence does not
mean the same thing that it would mean in UTF8 encoding*.
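
To make the difference concrete, here is a rough illustration in Python rather than SQL. It assumes Python's "latin-1" and "utf-8" codecs behave like the corresponding server encodings, and it uses Python's chr(), which always takes a Unicode code point, as a stand-in for what chr() does in a UTF8 database:

```python
raw = bytes([226, 128, 166])  # the byte sequence 0xe2 0x80 0xa6

# Read as LATIN1: three separate one-byte characters.
as_latin1 = raw.decode("latin-1")
latin1_code_points = [hex(ord(c)) for c in as_latin1]
print(latin1_code_points)  # ['0xe2', '0x80', '0xa6']

# Read as UTF-8: a single three-byte character, U+2026.
as_utf8 = raw.decode("utf-8")
utf8_code_points = [hex(ord(c)) for c in as_utf8]
print(utf8_code_points)  # ['0x2026']

# Building the string from three code points, then encoding as UTF-8,
# gives three characters of two bytes each, not 0xe2 0x80 0xa6.
three_chars = chr(226) + chr(128) + chr(166)
three_chars_utf8 = three_chars.encode("utf-8")
print(three_chars_utf8.hex(" "))  # c3 a2 c2 80 c2 a6
```

Either way you get three characters; the same three bytes only mean U+2026 when something interprets them as UTF8.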

You also seem not to grasp the fact that an encoding conversion
will happen between your client and the server if client_encoding
is different from server_encoding.  Because of that, the output of
a SELECT command doesn't prove much of anything here.
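
As a sketch of why the SELECT output can mislead: the function below is a hypothetical model of the conversion step, not anything taken from the server, and it assumes a LATIN1 server talking to a client whose client_encoding is UTF8.

```python
def convert_for_client(stored: bytes, server_encoding: str,
                       client_encoding: str) -> bytes:
    """Hypothetical model: re-encode stored bytes for the client."""
    if server_encoding == client_encoding:
        return stored  # nothing to convert
    # Interpret the bytes as characters in the server encoding,
    # then express those same characters in the client encoding.
    return stored.decode(server_encoding).encode(client_encoding)


stored = bytes([0xE2, 0x80, 0xA6])  # three LATIN1 characters on disk
received = convert_for_client(stored, "latin-1", "utf-8")

print(stored.hex(" "))    # e2 80 a6
print(received.hex(" "))  # c3 a2 c2 80 c2 a6
```

The bytes the client sees differ from what is stored, so you cannot read the stored representation off the query result.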

			regards, tom lane





