Re: lc_collate issue

> Cody Pisto <cpisto@xxxxxxxxx> writes:
> > If initdb was done with a C locale, and thus lc_collate and friends 
> > were all C, but the database and client encoding was set to UTF-8, 
> > would postgres convert data on the fly from UTF-8(storage) to ASCII for 
> > sorting or would things just blow up when a >1 byte character hit the mix?
> 
> No, C locale just sorts the bytes.  It won't "blow up".  Whether it will
> give you a sort ordering you like for multibyte characters is a
> different question.

Yup.
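
To illustrate what "just sorts the bytes" means, here is a minimal
Python sketch. It is not PostgreSQL's code; it only assumes that a C
locale comparison behaves like a plain byte-by-byte comparison of the
stored UTF-8 data. The sample words and the helper name
c_locale_sort are made up for the example.

```python
# Hypothetical sample data: ASCII, accented Latin, and CJK strings.
words = ["zebra", "Zoo", "apple", "éclair", "日本"]


def c_locale_sort(strings):
    """Sort strings by their raw UTF-8 bytes, the way a byte-wise
    (C locale style) comparison would see them."""
    return sorted(strings, key=lambda s: s.encode("utf-8"))


byte_order = c_locale_sort(words)

# UTF-8 preserves code point order, so the byte-wise sort matches a
# plain code point sort. Multibyte characters simply sort after ASCII.
code_point_order = sorted(words)

print(byte_order)
print(byte_order == code_point_order)
```

Nothing fails on the multibyte strings. Uppercase ASCII sorts before
lowercase, and every non-ASCII character sorts after all ASCII ones,
which may or may not be the ordering you want.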

For example, the LATIN1 part of UTF-8 (UNICODE) is physically ordered
the same as ISO 8859-1. So if you consider the order of ISO 8859-1
"natural", then the sort order of UTF-8 is OK as well. However, the
order of the CJK part of UTF-8 is totally different from that of the
original character sets (almost random), so you need to use convert()
to convert UTF-8 to the original encoding to get a "natural" sort
order. I don't think you are interested in the CJK part, though.
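
A rough way to see the difference outside the database is the Python
sketch below. It compares byte-wise sort order under UTF-8 with the
order under a legacy encoding. EUC-JP is picked here only as one
example of an "original" CJK character set, and the sample strings and
the helper name sort_by_encoding are assumptions made for the
illustration.

```python
def sort_by_encoding(strings, encoding):
    """Sort strings by their encoded bytes in the given encoding."""
    return sorted(strings, key=lambda s: s.encode(encoding))


# Latin-1 range: byte order is the same in UTF-8 and ISO 8859-1.
latin_words = ["été", "zoo", "Ärger", "ñu", "abc"]
latin_utf8 = sort_by_encoding(latin_words, "utf-8")
latin_iso = sort_by_encoding(latin_words, "iso-8859-1")
print(latin_utf8 == latin_iso)

# CJK range: Unicode and EUC-JP arrange the kanji differently,
# so the two byte-wise orders disagree.
kanji = ["一", "亜", "右", "愛"]
kanji_utf8 = sort_by_encoding(kanji, "utf-8")
kanji_eucjp = sort_by_encoding(kanji, "euc_jp")
print(kanji_utf8)
print(kanji_eucjp)
print(kanji_utf8 == kanji_eucjp)
```

The Latin-1 lists come out identical, while the two kanji lists do
not, which is why converting to the original encoding before sorting
changes the result for CJK data.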
--
Tatsuo Ishii
SRA OSS, Inc. Japan
