On Wed, Apr 23, 2008 at 04:35:04PM +0200, Tim Tassonis wrote:
> Ok, let me put it in another way. If UTF-8 is chosen at initdb, only
> UTF-8 databases can be created, if C is chosen, you can specify
> different encodings (UTF-8, LATIN1 etc) for each database.
>
> As I understood now, sorting will then still be in C style and not in
> the locale specific way. Which leads me to the following questions:
>
> If specifying a characterset different from the default locale for a
> database is such a bad idea, why is it possible at all?

It isn't possible, that's the point. What is possible is that clients
can use any encoding they like to talk to the server, but the server
will store and manage it all in one.

What locale C means is "I'm an encoding wizard and will ensure all my
programs can handle all the encodings I want to use, because I
understand the database will treat everything I send as ASCII bytes no
matter what encoding the clients say it is".

> From how I understand you, if I wanted a postgres server machine
> supporting databases with different charsets, I'm advised to initialise
> one cluster per locale.

If you want to control the *storage* charset, yes. If you just want
clients to think it's a LATIN9 DB, you can do:

ALTER DATABASE foo SET client_encoding=latin9;

> If specifying a characterset different from the default locale for a
> database is not a bad idea, why does the default install forbid me to do
> exactly this?

It is a bad idea, because normally the C library can only handle one
encoding at a time. Locale C is a backdoor because it has
system-independent semantics and does not require libc. It's also not
what people usually want, and so not recommended.

Have a nice day,
-- 
Martijn van Oosterhout   <kleptog@xxxxxxxxx>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.
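
To make the "everything is just bytes" point about locale C a bit more concrete, here is a small Python sketch. It is not PostgreSQL code and only mimics the idea: the word list and the helper name c_locale_sort are made up for illustration, and sorting by encoded bytes is only an approximation of what a byte-wise comparison in locale C gives you.

```python
# Sketch only: illustrates byte-wise ordering and encoding mix-ups,
# assuming nothing about PostgreSQL internals.

# Hypothetical sample data, not taken from the thread.
words = ["zebra", "Apple", "\u00e9clair", "apple", "Zulu"]


def c_locale_sort(strings, encoding="utf-8"):
    """Sort strings by their raw encoded bytes (roughly "C style" ordering)."""
    return sorted(strings, key=lambda s: s.encode(encoding))


# Upper case sorts before lower case, and the accented word lands last,
# because only byte values are compared.
byte_order = c_locale_sort(words)
print(byte_order)

# The same character is a different byte sequence in different encodings.
utf8_bytes = "\u00e9".encode("utf-8")         # two bytes
latin9_bytes = "\u00e9".encode("iso8859_15")  # one byte
print(utf8_bytes, latin9_bytes)

# If the bytes are stored untouched and later read by a client that assumes
# another encoding, the text comes out garbled.
mojibake = utf8_bytes.decode("iso8859_15")
print(mojibake)
```

The last step is the situation you have to guard against yourself under locale C: nothing in between converts or checks the bytes, so every program touching the data has to agree on the encoding.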