On 7/1/06, Martijn van Oosterhout <kleptog@xxxxxxxxx> wrote:
On Fri, Jun 30, 2006 at 07:29:12PM +0200, Tomi NA wrote:
> If I sound harsh, please excuse me, but I feel like I'm the only one
> who thinks these encoding problems (collation, upper/lowercase,
> multiple languages in a single database) are serious...nobody seems to
> share the sentiment. Ah well...

I agree with you, however the resistance (AFAICS) comes mostly from the fact that we would be depending on an external library to do it. I don't think postgres should try doing it itself, given that the unicode character databases are quite large by themselves.

Alternatively, the postgres group could produce a customised version of ICU that's smaller (the website has details about how). But in any case, this problem will need to be addressed at some point.

Basically, it comes down to three possibilities, doesn't it:
1.) use an existing library
2.) write a pgsql-specific implementation
3.) forget about it and tend to other issues

Personally, I don't really care if it's 1) or 2): I'm just afraid it's going to be 3).

Is this a licensing issue (with regard to ICU being under the IBM public licence)? A plugin architecture (to get rid of licensing headaches) issue? Are there any other libraries that might do the job?

To be perfectly honest, I've had to tackle so many problems with encodings over the years that I'd make it punishable by law to use anything *but* UTF...but I'm not president of the Galaxy yet, Zaphod is. (-:

t.n.a.