Search Postgresql Archives

Re: Case Insensitive Comparison with Postgres 12

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



	Igal Sapir wrote:

> > Out of curiosity is there a eason not to use the citext type for th?
> >
> >
> Using the collation seems like a much cleaner approach, and I trust ICU to
> do a better job at comparing strings according to language rules etc.

One notable difference between citext and case-insensitive collations
by ICU is that the latter recognizes canonically equivalent sequences
of codepoints [1] as equal, while the former does not.

For instance:

=# CREATE COLLATION ci (locale='und@colStrength=secondary', 
      provider='icu', deterministic=false);

=# SELECT E'E\u0302TES'::citext = 'Êtes'::citext AS "citext-equal", 
	  E'E\u0302TES' = 'Êtes' collate "ci" AS "ci-equal";
 citext-equal | ci-equal 
--------------+----------
 f	      | t

Another significant difference is that building or rebuilding an index on a
text column with a CI collation appears to be way faster than with citext
(I've seen 10:1 ratios, but do your own tests).

On the minus side, substring matching with LIKE or other methods
is not possible with CI collations whereas it does work with citext.


[1] https://en.wikipedia.org/wiki/Unicode_equivalence


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite






[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux