RE: About Unicode IVS

Thank you for your reply.

In SQL Server, a base character followed by a variation selector (two code points) is treated as a single character. The collation is Japanese_XJIS_140_CS_AS_KS_WS_VSS_UTF8.

Moto.

-----Original Message-----
From: Tom Lane <tgl@xxxxxxxxxxxxx> 
Sent: Tuesday, March 29, 2022 7:26 PM
To: Holger Jakobs <holger@xxxxxxxxxx>
Cc: pgsql-admin@xxxxxxxxxxxxxxxxxxxx; n2029@xxxxxxxxxxxxx
Subject: Re: About Unicode IVS

Holger Jakobs <holger@xxxxxxxxxx> writes:
> It's totally correct that the two characters are still two characters.
> You would have to normalize the string first, so that the combination 
> becomes one character.

Yeah.  In principle the normalize() function ought to do this for you.  But it doesn't seem to shorten the given example for me; I'm not sure if that means the example is incorrect, or if it's a bug in normalize().

u8=# select octet_length(U&'\+008FBA' || U&'\+0E0102');
 octet_length
--------------
            7
(1 row)

u8=# select octet_length(normalize(U&'\+008FBA' || U&'\+0E0102'));
 octet_length
--------------
            7
(1 row)
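
One way to cross-check this outside PostgreSQL is to run the same sequence through another Unicode implementation. The sketch below uses Python's standard unicodedata module; the helper name describe() is made up for illustration and is not part of any library. It relies on the fact that variation selectors have no decomposition mappings, so none of the normalization forms would be expected to merge the pair into a single code point.

```python
import unicodedata

# U+8FBA (a CJK ideograph) followed by U+E0102 (VARIATION SELECTOR-19),
# the same two code points used in the psql session above.
ivs = "\u8fba\U000E0102"


def describe(s):
    """Return (code point count, UTF-8 byte count) for a string.

    Hypothetical helper for this illustration only.
    """
    return len(s), len(s.encode("utf-8"))


# Apply every standard normalization form to the sequence.
forms = {
    form: unicodedata.normalize(form, ivs)
    for form in ("NFC", "NFD", "NFKC", "NFKD")
}

print("original", describe(ivs))
for form, result in forms.items():
    status = "unchanged" if result == ivs else "changed"
    print(form, describe(result), status)
```

If this also reports 7 bytes and an unchanged string for every form, that would point toward the sequence simply not being composable by normalization, rather than toward a bug specific to normalize().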

			regards, tom lane






