Re: unicode and =

"Grant Morgan" <grant@xxxxxxxxxxx> · Mon, 20 Jun 2005 17:50:55 +0900

I am not sure what locale I was running under, as I had not set one when doing initdb.
I created a new DB with --locale=en_US.utf8 -E UNICODE
and imported my data from the original source (not copied from the old DB), and I still have the same problem: UNICODE strings with double-byte characters that are not equal get selected as equal.

To test things further:
md5(h1.name)=md5(h2.name)
works and only matches equal values, while
h1.name=h2.name
matches unequal values.
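
A rough way to look at the difference between the two tests outside the database is sketched below in Python. It assumes that md5() hashes the stored bytes while = on a character column goes through the locale's collation routine (strcoll); that assumption is not confirmed anywhere in this thread. The helper names and the sample strings are made up for illustration.

```python
import hashlib
import locale


def bytes_equal(a: str, b: str) -> bool:
    """Compare the UTF-8 encodings byte for byte."""
    return a.encode("utf-8") == b.encode("utf-8")


def md5_equal(a: str, b: str) -> bool:
    """Compare MD5 digests of the UTF-8 bytes, analogous to md5(x) = md5(y)."""
    digest_a = hashlib.md5(a.encode("utf-8")).hexdigest()
    digest_b = hashlib.md5(b.encode("utf-8")).hexdigest()
    return digest_a == digest_b


def collation_equal(a: str, b: str) -> bool:
    """Compare with the process's current LC_COLLATE rules (strcoll)."""
    return locale.strcoll(a, b) == 0


if __name__ == "__main__":
    try:
        # Pick up the collation from the environment (LANG / LC_ALL).
        locale.setlocale(locale.LC_COLLATE, "")
    except locale.Error:
        pass  # unsupported locale: stay on the default "C" collation

    s1, s2 = "日本", "中国"  # hypothetical sample values, not from my data
    print("bytes equal:    ", bytes_equal(s1, s2))
    print("md5 equal:      ", md5_equal(s1, s2))
    print("collation equal:", collation_equal(s1, s2))
```

If the last line ever prints True for two strings whose bytes differ, the collation rules of the active locale are treating them as equal, which would be consistent with what I am seeing in the database.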

Anyone have any other ideas? Or is en_US.utf8 not a proper utf8 locale? (I got the name by doing locale -a.)
I am not so concerned about sorting on this project, just equality, but a general solution would be appreciated.

Thanks,
Grant

On Mon, 20 Jun 2005 10:13:39 +0900, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:

"Grant Morgan" <grant@xxxxxxxxxxx> writes:
= is not working on a char(30) column for me.
I want to find rows with equal name.
I have my database set to unicode.

I'll bet you are running the postmaster in a locale that isn't expecting
utf-8 encoding.  The locale and encoding have to match or you're going
to get very strange behavior.
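
As a minimal sketch of what "match" means here, the Python below pulls the codeset suffix out of a locale name such as the ones printed by locale -a and checks whether it names UTF-8. The function names are hypothetical, and it only inspects the name: a locale without an explicit codeset (for example plain en_US) returns None rather than guessing what the system defaults to.

```python
from typing import Optional


def locale_codeset(name: str) -> Optional[str]:
    """Return the codeset part of a locale name, or None if it has none.

    "en_US.utf8"      -> "utf8"
    "de_DE.UTF-8@euro" -> "UTF-8"
    "en_US"           -> None
    """
    base = name.split("@", 1)[0]  # drop a modifier such as "@euro"
    if "." not in base:
        return None
    return base.split(".", 1)[1]


def is_utf8_locale(name: str) -> bool:
    """True if the locale name explicitly carries a UTF-8 codeset."""
    codeset = locale_codeset(name)
    if codeset is None:
        return False
    # Spellings vary between systems: "utf8", "UTF-8", "utf-8".
    return codeset.replace("-", "").lower() == "utf8"


if __name__ == "__main__":
    for candidate in ("en_US.utf8", "ja_JP.UTF-8", "en_US.ISO-8859-1", "en_US", "C"):
        print(candidate, is_utf8_locale(candidate))
```

A name that passes this check only shows that the locale claims UTF-8; it says nothing about how well that locale's collation rules cover any particular characters.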

			regards, tom lane
