On Thu, 27 Jan 2005 00:16:14 +0000, Vladimir S. Petukhov <vladimir@xxxxxxxxxx> wrote:
> > > LC_COLLATE: ru_RU
> > > LC_CTYPE: ru_RU
> > >    Name    |  Owner   | Encoding
> > > -----------+----------+----------
> > >  testdb    | postgres | UNICODE
> > > And LIKE, ILIKE, ~ do not recognize upper/lower case..
> >
> > What character encoding is implied by those LC_ settings on your machine?
> > If it's different from the database encoding (here utf8) these things
> > won't actually work right.
> LANG=ru_RU.koi8r
> LC_ALL=ru_RU.koi8r
> But how it act on lower/upper cases? Client use utf-8 encoding...

The client uses UTF-8 encoding, and so does the server. Text is stored in
UTF-8. However, when you call the lower() function, PostgreSQL does more or
less the following:

-- it retrieves the text value from the database. This text is in UTF-8 encoding.
-- it passes this text to the C library's locale-aware case-conversion
   routine (tolower() or a relative; strxfrm() is the collation counterpart
   and has the same problem).
-- that routine sees that the current locale is ru_RU.koi8r
-- it therefore takes the UTF-8 encoded bytes and treats them as KOI8-R
-- it "skips over" the bytes it does not recognize as letters to convert
   (the UTF-8 multibyte characters)
-- it returns the converted text
-- PostgreSQL takes the resulting text, believing it is still in UTF-8.

In other words, probably only Latin characters were actually affected by
lower(); any "unknown" Russian UTF-8 characters were at best skipped.

Please note that PostgreSQL does not do an implicit utf8->koi8r->utf8
conversion when calling lower(). AFAIK it does not even know (or care)
whether the current locale setting ("ru_RU") implies a different encoding
than the current database's. It is the DB admin's duty to make sure the
cluster locale (set at initdb time) is compatible with the database encoding
(set at createdb time).

Regards,
Dawid
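P.S. The effect described above (UTF-8 bytes being interpreted under a KOI8-R locale) can be imitated in a few lines of Python. This is only a rough simulation, not PostgreSQL's code: the helper name lower_under_wrong_locale is made up for illustration, and Python's own str.lower() stands in for the C library's case conversion.

```python
def lower_under_wrong_locale(text, db_encoding="utf-8", locale_encoding="koi8_r"):
    """Hypothetical helper: imitate lower() when the locale's encoding
    differs from the encoding the text is actually stored in."""
    stored = text.encode(db_encoding)          # bytes as stored in the database
    misread = stored.decode(locale_encoding)   # the same bytes, read as KOI8-R
    lowered = misread.lower()                  # case conversion on the misread text
    result = lowered.encode(locale_encoding)   # back to raw bytes
    # The caller believes the bytes are still in the database encoding.
    return result.decode(db_encoding, errors="replace")


if __name__ == "__main__":
    sample = "ПРИВЕТ Hello"
    print(lower_under_wrong_locale(sample))  # encoding mismatch
    print(sample.lower())                    # what the user expected
```

With this sample string the Latin part comes out lowercased while the Cyrillic part stays in upper case, which matches the symptom reported for LIKE/ILIKE. Other inputs may come out differently, since some UTF-8 bytes can happen to coincide with KOI8-R letters.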