On Wed, Aug 29, 2012 at 12:43 PM, Bruce Momjian <bruce@xxxxxxxxxx> wrote:
> On Wed, Aug 29, 2012 at 10:31:21AM -0700, Aleksey Tsalolikhin wrote:
>> On Wed, Aug 29, 2012 at 9:45 AM, Merlin Moncure <mmoncure@xxxxxxxxx> wrote:
>> > citext unfortunately doesn't allow for index optimization of LIKE
>> > queries, which IMNSHO defeats the whole purpose. So the best way
>> > remains to use lower() ...
>> > This will be index optimized and fast as long as you specified the C
>> > locale for your database.
>>
>> What is the difference between C and en_US.UTF8, please? We see that
>> the same query (that invokes a sort) runs 15% faster under the C
>> locale. The output between C and en_US.UTF8 is identical. We're
>> considering moving our database from en_US.UTF8 to C, but we do deal
>> with internationalized text.
>
> Well, C has reduced overhead for string comparisons, but obviously
> doesn't work well for international characters. The single-byte
> encodings have somewhat less overhead than UTF8. You can try using C
> locales for databases that don't require non-ASCII characters.

To add: the middle ground I usually choose is a database encoding of
UTF8 combined with the C (aka POSIX) locale. This lets you store any
Unicode text, while indexing operations use the faster C string
comparison routines for a significant performance boost, especially
for partial string searches on an indexed column. This is an even more
attractive option in 9.1, which adds the ability to specify collations
at runtime.

merlin
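
For anyone wanting to try the UTF8-plus-C-locale setup described above,
here is a rough sketch of what it might look like. The database, table,
column, and index names (appdb, users, email, users_email_lower_idx) are
made up for illustration, and the "en_US" collation is assumed to be one
your operating system provided at initdb time; adjust to whatever
pg_collation actually lists on your system.

```sql
-- Hypothetical database: UTF8 storage, bytewise (C) comparison and sorting.
-- template0 is used because the locale differs from the default template's.
-- Caveat: with LC_CTYPE 'C', lower() case-folds ASCII letters only.
CREATE DATABASE appdb
    ENCODING 'UTF8'
    LC_COLLATE 'C'
    LC_CTYPE 'C'
    TEMPLATE template0;

-- Run the rest while connected to appdb.
-- Hypothetical table used only for this example.
CREATE TABLE users (
    id    serial PRIMARY KEY,
    email text NOT NULL
);

-- Expression index so case-insensitive lookups go through lower().
CREATE INDEX users_email_lower_idx ON users (lower(email));

-- Left-anchored LIKE on the indexed expression; under C collation the
-- planner can turn this into an index range scan.
SELECT id, email
FROM users
WHERE lower(email) LIKE 'merlin%';

-- 9.1 and later: request linguistic ordering only where it matters,
-- while the database default stays C.
SELECT email
FROM users
ORDER BY email COLLATE "en_US";
```

Running EXPLAIN on the LIKE query is a quick way to check whether the
index is actually being used on your data.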