On Thu, 18 Nov 2004 11:08:38 +0100, Markus Wollny
<markus.wollny@xxxxxxxxxxx> wrote:
> Oleg, what exactly do you mean by "tsearch2 doesn't support unicode yet"?
> It does seem to work fine in my database, it seems:
> ./pg_controldata [mycluster] gives me
> pg_control version number: 72
> [...]
> LC_COLLATE: de_DE.UTF-8
> LC_CTYPE: de_DE.UTF-8

Correct me if I am wrong, but I think that UTF-8 is almost identical to
ISO-8859-1 in binary form. I mean, UTF-8 is ISO-8859-1 plus multibyte
characters from other charsets. If I am correct, there is no difference
for Tsearch2 between UTF-8 and ISO-8859-1, so German locales work fine.
For characters covered by ISO-8859-2 and similar charsets this will not
be the case, since those are multibyte characters in UTF-8.

Oh, and a side-question -- is there any facility to "strip" charsets
like ISO-8859-2 down to ASCII (not SQL_ASCII, I mean plain ASCII: only
Latin characters without accents, etc.)? I think it would be useful for
me, since I intend to search through mail-like messages, and many people
are lazy (or accustomed to it) and don't use special characters, only
their non-accented ASCII counterparts.

Regards,
Dawid
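
A minimal sketch of the two points raised above, assuming Python 3 and only
its standard library. It is not a Tsearch2 or PostgreSQL facility. The
helper names strip_accents and same_bytes are hypothetical, chosen here for
illustration. The first strips text down to plain ASCII by decomposing
accented letters and discarding the accents. The second checks whether a
string has the same bytes in UTF-8 and ISO-8859-1.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Return text reduced to ASCII, with accents removed where possible."""
    # NFKD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping only the base letters.
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Letters with no decomposition (for example "ł" or "ß") are still
    # non-ASCII at this point; this sketch simply drops them.
    return without_marks.encode("ascii", "ignore").decode("ascii")


def same_bytes(text: str) -> bool:
    """Return True if text is byte-identical in UTF-8 and ISO-8859-1.

    Raises UnicodeEncodeError if text contains a character that
    ISO-8859-1 cannot represent at all.
    """
    return text.encode("utf-8") == text.encode("iso-8859-1")
```

With these helpers, strip_accents("Müller") gives "Muller". same_bytes is
true for pure ASCII input such as "Mueller" and false for "Müller", because
"ü" is a single byte in ISO-8859-1 and two bytes in UTF-8. Letters that
have no decomposition would need an explicit mapping table if they should
be transliterated rather than dropped.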