Re: Mixing different LC_COLLATE and database encodings

On Tue, Feb 21, 2006 at 10:27:15AM +0900, Tatsuo Ishii wrote:
> If you are considering allowing only UTF-16 or some other single
> encoding in the backend, I am strongly against the idea. We Japanese
> need native support for those encodings. Converting them to and from
> Unicode every time the backend and frontend communicate would be a
> serious performance hit. Moreover, the conversion is known not to be
> round-trip safe, which means some information will be lost during the
> conversion. Another point is the on-disk format. UTF-16 will require
> more storage than local encodings. UTF-8 will probably require more.

I didn't say that we would only support UTF-16 in the backend. I said
that when doing comparisons in a non-C locale, you have to convert to
UTF-16 to use ICU. If you don't want to use it, don't; it's not going to
be required at any point. It's just like Win32 currently: if you use
UTF-8, it has to be converted to UTF-16 prior to string comparison.
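
As a rough illustration of the conversion step described above (this is not PostgreSQL or ICU code), the sketch below decodes UTF-8 input, re-encodes it as UTF-16 code units, and only then compares. The function names are hypothetical, and the final comparison is a plain code-unit comparison standing in for whatever locale-aware routine would actually be called.

```python
def utf16_units(utf8_bytes: bytes) -> list[int]:
    """Decode UTF-8 bytes and return the equivalent UTF-16 code units."""
    raw = utf8_bytes.decode("utf-8").encode("utf-16-le")
    # Two bytes per code unit, little-endian.
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def compare_after_conversion(a_utf8: bytes, b_utf8: bytes) -> int:
    """Return -1, 0 or 1 after converting both operands to UTF-16.

    Stand-in for a real collator: it compares raw code units, whereas a
    collation library would apply locale rules to the converted strings.
    """
    a, b = utf16_units(a_utf8), utf16_units(b_utf8)
    return (a > b) - (a < b)
```

The point is only that the conversion happens per comparison, on the way into the comparison routine; nothing about the stored encoding changes.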

The only time any of this is required is *sorting*, and if you have an
index defined, it acts as a cache for the sorted values. Of course
there's a tradeoff, but unless you're sorting large datasets all day I
doubt it'll be noticeable.

If you're not sorting, none of this is relevant to you.
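
To make the "index as a cache" idea concrete, here is a minimal sketch, assuming a hypothetical `expensive_sort_key` that stands in for a costly locale-aware key computation. It is a toy sorted list, not how a database index is implemented, but it shows the cost being paid once at insert time rather than on every ordered read.

```python
import bisect

calls = 0  # counts how often the (hypothetical) expensive key is computed


def expensive_sort_key(text: str) -> str:
    """Hypothetical stand-in for a costly locale-aware key computation."""
    global calls
    calls += 1
    return text.casefold()


class SortedIndex:
    """Keeps (key, value) pairs in key order; each key is computed once, on insert."""

    def __init__(self):
        self._entries = []

    def insert(self, value: str) -> None:
        bisect.insort(self._entries, (expensive_sort_key(value), value))

    def in_order(self) -> list[str]:
        # Reading back in order needs no further key computation.
        return [value for _, value in self._entries]


index = SortedIndex()
for name in ["pear", "Apple", "fig"]:
    index.insert(name)

calls_after_build = calls
ordered = index.in_order()
```

After the three inserts, reading `ordered` does not increase `calls`, which is the tradeoff in question: slower writes in exchange for sorted reads that skip the conversion entirely.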

> I have a feeling that ICU is good for applications, but is not for
> DBMSs.

I think providing a system where users can choose from a wide range of
possible collation orders, and if necessary specify their own, is a
worthy goal. Look at the complaints we get now and then from people who
choose en_US as their locale and are surprised when it gives them a
dictionary sort.
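
The surprise is easy to reproduce in miniature. The sketch below contrasts C-locale ordering (plain code point order) with a crude, case-insensitive approximation of a dictionary sort. The `dictionary_like_key` function is an assumption for illustration only; real en_US collation rules are more elaborate than this.

```python
words = ["apple", "Zebra", "banana", "Apple"]

# C locale: plain code point order, so all uppercase ASCII sorts first.
c_order = sorted(words)


def dictionary_like_key(word: str) -> tuple[str, str]:
    """Crude approximation of a dictionary sort.

    Letters are compared ignoring case first; the raw string only breaks ties.
    """
    return (word.casefold(), word)


dictionary_order = sorted(words, key=dictionary_like_key)
```

Here `c_order` puts "Zebra" ahead of "apple", while `dictionary_order` puts it last, which is the kind of difference people run into when they pick a locale without expecting it to affect ordering.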

ICU allows users to take an existing collation and tweak it if it
doesn't quite match their expectations. You think this is not useful
for a DBMS?
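
As a toy picture of what "take an existing collation and tweak it" means, the sketch below builds a sort key from an ordered alphabet and then derives a second ordering by moving one character. It does not use ICU at all; `make_collation_key`, `tailor`, and both alphabets are hypothetical and exist only to show the idea of starting from a base order and adjusting it.

```python
def make_collation_key(alphabet: str):
    """Build a sort-key function from an ordered alphabet (hypothetical helper)."""
    rank = {ch: i for i, ch in enumerate(alphabet)}
    # Characters outside the alphabet sort after it, in code point order.
    return lambda word: [(rank.get(ch, len(alphabet)), ch) for ch in word.casefold()]


def tailor(alphabet: str, char: str, after: str) -> str:
    """Return a copy of the alphabet with `char` moved to just after `after`."""
    stripped = alphabet.replace(char, "")
    pos = stripped.index(after) + 1
    return stripped[:pos] + char + stripped[pos:]


BASE = "aäbcdefghijklmnopqrstuvwxyz"       # 'ä' sorts next to 'a'
TAILORED = tailor(BASE, "ä", after="z")     # 'ä' sorts after 'z'

words = ["zebra", "ärlig", "apple"]
base_order = sorted(words, key=make_collation_key(BASE))
tailored_order = sorted(words, key=make_collation_key(TAILORED))
```

With the base alphabet "ärlig" sorts between "apple" and "zebra"; with the tailored one it sorts last. A user who expects the second behaviour only has to state the one difference, not define a whole collation from scratch.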

Have a nice day,
-- 
Martijn van Oosterhout   <kleptog@xxxxxxxxx>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.


