On Mon, Nov 07, 2005 at 02:28:05PM +0100, Guido Neitzer wrote:
> I think I was the one who asked.
>
> I worked on my locale problem on the weekend and was able to build a
> LC_COLLATE file, that actually works with ISO locales, but not with
> UTF-8 (50% progress ... ;-)).

Guess the problem is that you have to import the entire Unicode
database to make it work. I think the code is multibyte aware though,
it's just that no-one has done the work.

Disclaimer: I'm working with Linux/Glibc which has had proper collation
for quite a while now so I have no real understanding of systems that
don't have it.

> When you test the UNIX utility "sort" on Mac OS X, you should be
> aware, that the pre-installed version on Mac OS X ignores locales at
> all ... :-( I had to install the gnu coreutils to get a sort that
> works with locales, and this also fails on UTF-8 but works with ISO
> encoding/collate - same as PG does.

Nasty.

> Now I'm not sure, whether my own LC_COLLATE file is not appropriate
> for UTF-8 (why not?) or whether Mac OS X locale does not support
> UTF-8 at all as you state.

Hmm, I just went back to the source code (adv_cmds-79.1) and it looks
like collations don't support UTF-8 at all. Or any multibyte encoding.

> Will be cool to have locale support directly in PostgreSQL.

Yeah, but seems a bit lame for an operating system to claim to support
multibyte locales if it can't do collation on them. :( It supports
everything but collation, so it's obviously not a priority.

> So, just a quick question regarding a switch: is there a problem with
> using ISO8859-15 for now, and do a switch later with dumping the data
> and import it to a newer version which should then use UTF-8? Do I
> need to do some conversion or how does this work?

If you import as ISO8859-15 now, when you do the upgrade, simply set
the client encoding to that and PostgreSQL will convert it all to UTF-8
during the load.
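
To make that last point a bit more concrete, here is a rough sketch in
Python of what the conversion amounts to at the byte level. It is only
an illustration: it doesn't talk to PostgreSQL at all, and the helper
name latin9_to_utf8 and the sample string are made up for the example.

```python
# Illustration only: mimics an ISO8859-15 -> UTF-8 conversion in plain
# Python. This is NOT PostgreSQL code; the helper and sample are hypothetical.

def latin9_to_utf8(raw: bytes) -> bytes:
    """Decode bytes as ISO8859-15, then re-encode the text as UTF-8."""
    return raw.decode("iso8859_15").encode("utf-8")

# In ISO8859-15 every character is a single byte (the euro sign is 0xA4).
sample = "Käse 5 €".encode("iso8859_15")
converted = latin9_to_utf8(sample)

# The text is unchanged; only the byte representation differs, because
# non-ASCII characters take more than one byte in UTF-8.
print(len(sample), len(converted))
print(converted.decode("utf-8"))
```

The text survives the round trip as long as the source encoding is
declared correctly, which is why the client encoding has to match what
the data was stored as.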

Have a nice day,
--
Martijn van Oosterhout <kleptog@xxxxxxxxx> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.