Thanks Tom, for your reply.
Tom Lane wrote:
Carlos Moreno <moreno_pg@xxxxxxxxxxx> writes:
Why is it that the database
cluster is restricted to a single locale (or single set of locales) instead
of being configurable on a per-database basis?
Because we depend on libc's locale support, which (on many platforms)
isn't designed to switch between locales cheaply [...]
This stuff is certainly far from ideal, but the amount of work involved
to fix it is daunting; see many past pg-hackers discussions.
Fair enough --- and good to know.
2) By the same token (more or less), I have a test database, for which
I ran initdb without specifying encoding or locale; then, I create a
database with UTF8 encoding.
There's no such thing as "you didn't specify a locale". If you didn't
specify one on the initdb command line, then it was taken from the
environment. Try "show lc_collate" and "show lc_ctype" to see what
got used.
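
For reference, a minimal sketch of how a locale category is usually resolved
from the environment when none is given explicitly. It follows the standard
POSIX precedence (LC_ALL, then the category's own variable, then LANG). The
function name effective_locale is hypothetical, and the fallback to "C" when
nothing is set is an assumption (POSIX leaves that default to the
implementation). This is not initdb's actual code.

```python
def effective_locale(category, env):
    """Return the locale name POSIX precedence would pick for one category.

    category: a variable name such as "LC_COLLATE" or "LC_CTYPE".
    env: a mapping of environment variables (for example os.environ).
    """
    # LC_ALL overrides everything, then the category's own variable, then LANG.
    for var in ("LC_ALL", category, "LANG"):
        value = env.get(var)
        if value:  # unset or empty variables are skipped
            return value
    # Assumed fallback; the default is implementation-defined in POSIX.
    return "C"


if __name__ == "__main__":
    import os
    for cat in ("LC_COLLATE", "LC_CTYPE"):
        print(cat, "->", effective_locale(cat, os.environ))
```

Run in the shell that launched initdb, this prints the names that would be
expected to match "show lc_collate" and "show lc_ctype".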
Yes, that's what I meant --- I meant that I did not use the --locale or
-E command-line switches for the initdb command. Both lc_ctype and
lc_collate show en_US.UTF-8.
I try lower of a string that
contains characters with accents (e.g., Spanish or French characters),
and it works as it should according to Spanish or French rules --- it
returns a string with the same characters in lowercase, with the same
accent. Why did that work? My Linux machine has all en_US.UTF-8
locales, and en_US is not even aware of characters with accents,
You sure? I'd sort of expect a UTF8 locale to know this stuff anyway.
In any case, Postgres doesn't know anything about case conversion
beyond what toupper/tolower tell it, so your experimental result is
sufficient proof that that locale includes these conversions.
Are you sure there's nothing in the way PostgreSQL interacts with the C
conversion functions that could matter here? I ask because, as part of a
"sanity check", I repeated the tests --- now with two machines; one has
PG 8.1.4, the other has 7.4.14, and they behave differently.
The one that does the case conversion "correctly" (read: as I expect per
Spanish or French rules) is 8.1.4 with en_US locale (LC_CTYPE and
LC_COLLATE both showing en_US.UTF-8). PG 7.4.14, *even with
locale es_ES*, does not do the case conversion (characters with accent
or tilde are left untouched).
I wonder if someone could shed some light on this little mystery....???
Perhaps to add more confusion to my experimental/informal tests, PG 8.1.4
is running on a FC4 AMD64 X2 box (the command "locale" at the shell
prompt shows all en_US.utf8), and PG 7.4.14 is running on a laptop with
FC5 on an Intel Celeron M (the command locale shows exactly the same
in that case). Does this perhaps account for the difference?
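
One guess that might be worth ruling out (purely an assumption; nothing in
this thread confirms how either release calls the C library): in UTF-8 an
accented letter occupies more than one byte, so a routine that lowercases
one byte at a time would leave it untouched, while one that works on whole
characters would convert it. A small Python sketch of that difference,
using Python's own bytes.lower() and str.lower() as stand-ins rather than
tolower/towlower:

```python
def lower_bytewise(text):
    """Lowercase the UTF-8 bytes one at a time.

    bytes.lower() only changes ASCII A-Z, so the bytes that make up an
    accented letter pass through unchanged.
    """
    return text.encode("utf-8").lower().decode("utf-8")


def lower_charwise(text):
    """Lowercase whole characters using Python's Unicode case mapping."""
    return text.lower()


if __name__ == "__main__":
    sample = "ÉCOLE NIÑO"
    print("byte-wise:     ", lower_bytewise(sample))
    print("character-wise:", lower_charwise(sample))
```

In the byte-wise result the plain ASCII letters are lowercased but the
accented ones are left alone, which at least resembles what I see on the
7.4.14 machine.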
Thanks,
Carlos