On Tue, Oct 31, 2006 at 11:47:56PM -0500, Tom Lane wrote:

> Because we depend on libc's locale support, which (on many platforms)
> isn't designed to switch between locales cheaply. The fact that we
> allow a per-database encoding spec at all was probably a bad idea in
> hindsight --- it's out front of what the code can really deal with.
> My recollection is that the Japanese contingent argued for it on the
> grounds that they needed to deal with multiple encodings and didn't
> care about encoding/locale mismatch because they were going to use
> C locale anyway. For everybody else though, it's a gotcha waiting
> to happen.

Could this paragraph be put into the docs and/or the FAQ, please?
Along with the recommendation that if you require multiple encodings
for your databases, you had better either have your OS locale
configured properly for UTF8 and use UNICODE databases, or do initdb
with the C locale.

> This stuff is certainly far from ideal, but the amount of work involved
> to fix it is daunting; see many past pg-hackers discussions.

Here are a few data points from my Debian/Testing system in favour of
not worrying too much about the installed size of ICU, since other
packages already depend on it anyway:

libicu36
Reverse Depends:
  openoffice.org-writer          * OOo
  openoffice.org-filter-so52
  openoffice.org-core
  libxerces27                    * Xerces XML parser (Apache camp)
  libboost-regex1.33.1
  libboost-dbg

icu
Reverse Depends:
  libicu36
  libicu36
  libxercesicu26                 * Xerces, again
  libxercesicu25
  libicu28-dev
  libicu28
  libicu21c102
  icu-i18ndata
  icu-data
  libwine                        * Wine

This, of course, does not decrease the work required to get this going
in PostgreSQL.

Thanks for the great work,
Karsten
-- 
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346