Re: Weird behaviour on a join with multiple keys

Tom Lane <tgl@xxxxxxxxxxxxx> · Fri, 09 Mar 2007 18:04:12 -0500

Charlie Clark <charlie@xxxxxxxxxxxxxx> writes:
> On 09.03.2007 at 16:15, Tom Lane wrote:
>> There's your problem right there.  The string comparison routines are
>> built on strcoll(), which is going to expect UTF8-encoded data because
>> of the LC_COLLATE setting.  If there are any high-bit-set LATIN1
>> characters in the database, they will most likely look like invalid
>> encoding to strcoll(), and on most platforms that causes it to behave
>> very oddly.  You need to keep lc_collate (and lc_ctype) in sync with
>> server_encoding.
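
To make the quoted point concrete, here is a minimal sketch (in Python, not
anything taken from the PostgreSQL sources) of why LATIN1 data looks broken to
a UTF8-expecting routine. The sample string and the helper name
is_valid_utf8 are made up for illustration; the only thing it shows is that a
high-bit-set LATIN1 byte does not form a valid UTF-8 sequence, while plain
ASCII is valid in both encodings.

```python
# Hypothetical sample: "Müller" stored in a LATIN1 database.
# In LATIN1 the u-umlaut is the single byte 0xFC.
latin1_bytes = "M\u00fcller".encode("latin-1")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (illustrative helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Pure ASCII is valid UTF-8, so it passes this check.
ascii_ok = is_valid_utf8(b"Mueller")

# 0xFC can never start a valid UTF-8 sequence, so this is rejected.
latin1_ok = is_valid_utf8(latin1_bytes)
```

What strcoll() actually does with such input is platform-dependent and is not
modelled here; the sketch only shows that the input is malformed from a UTF-8
point of view.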

> That does indeed seem to have been the problem even though the
> examples I was looking at were all using plain ASCII characters. Glad
> to know it wasn't a bug and to have learned something new.

Well, it *is* a bug: we really shouldn't let you select incompatible
locale and encoding settings.  This gotcha has been known for a long
time, but it's not clear that there's a bulletproof, portable way to
determine which encoding a particular locale setting implies ...
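
As a rough illustration of why this is hard, here is a sketch (Python, purely
hypothetical, not how the server does or would do it) of the obvious
heuristic: read the codeset out of the locale *name* and compare it with the
server encoding. The function names are invented for the example. It assumes
names of the form language_TERRITORY.codeset@modifier, and it has to give up
(return None) for names such as "C" or "de_DE" that carry no codeset, or for
encoding names Python does not recognise, which is exactly the non-bulletproof
part.

```python
import codecs
from typing import Optional


def codeset_from_locale_name(locale_name: str) -> Optional[str]:
    """Return the codeset part of a name like 'de_DE.UTF-8@euro', else None."""
    base = locale_name.split("@", 1)[0]  # drop an optional '@modifier'
    if "." not in base:
        return None  # 'C', 'POSIX', 'de_DE': the name alone does not say
    return base.split(".", 1)[1]


def same_encoding(a: str, b: str) -> Optional[bool]:
    """Compare two encoding names via Python's codec aliases.

    Returns None when either name is unknown to Python, since then no
    conclusion can be drawn.
    """
    try:
        return codecs.lookup(a).name == codecs.lookup(b).name
    except LookupError:
        return None


def locale_matches_encoding(locale_name: str,
                            server_encoding: str) -> Optional[bool]:
    """True/False when the name settles it, None when it cannot be decided."""
    codeset = codeset_from_locale_name(locale_name)
    if codeset is None:
        return None
    return same_encoding(codeset, server_encoding)


# The mismatch from this thread: a UTF-8 locale with a LATIN1 database.
mismatch = locale_matches_encoding("de_DE.UTF-8", "LATIN1")
# A name without a codeset leaves the question open.
undecided = locale_matches_encoding("de_DE", "UTF8")
```

A None result is the interesting case: the locale name by itself does not tell
you the encoding, and naming conventions differ between platforms.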

			regards, tom lane