Greg Stark <gsstark@xxxxxxx> writes:
> Tom Lane <tgl@xxxxxxxxxxxxx> writes:
>> If that does change the results, it indicates you've got strings which
>> are bytewise different but compare equal according to strcoll(). We've
>> seen this and other misbehaviors from some locale definitions when faced
>> with data that is invalid per the encoding the locale expects.

> There are plenty of non-bytewise-identical strings that do legitimately
> compare equal in various locales. Does the hash code hash strxfrm or the
> original bytes?

I think you are jumping to conclusions. I have not yet seen it
demonstrated that any locale definition in use in-the-wild intends to
compare nonidentical strings as equal. On the other hand, we have seen
plenty of cases of strcoll simply failing (delivering results that are
not even self-consistent) when faced with data it considers invalid.

I notice that the SUS permits strcoll to set errno if given invalid
data:
http://www.opengroup.org/onlinepubs/007908799/xsh/strcoll.html
We are not currently checking for that, but probably we should be.

			regards, tom lane
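
For illustration only, here is a minimal sketch of what checking errno
around strcoll() might look like. The helper name checked_strcoll is
hypothetical and is not an existing PostgreSQL function. Because strcoll()
reserves no return value for errors, the SUS page recommends setting errno
to 0 before the call and inspecting it afterwards, which is all this does.
Whether a given platform's strcoll() actually sets errno on invalid input
is an assumption that would need to be checked per platform.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * checked_strcoll (hypothetical helper, not part of PostgreSQL):
 * compare a and b with strcoll() under the current LC_COLLATE setting.
 *
 * The comparison result is stored in *result.  The return value is 0
 * if strcoll() left errno untouched, otherwise the errno value it set
 * (the SUS lists EINVAL as a possible error).  The caller's previous
 * errno is restored before returning.
 */
static int
checked_strcoll(const char *a, const char *b, int *result)
{
    int saved_errno = errno;
    int err;

    assert(a != NULL && b != NULL && result != NULL);

    /* No return value is reserved for errors, so clear errno first. */
    errno = 0;
    *result = strcoll(a, b);
    err = errno;

    errno = saved_errno;
    return err;
}
```

A caller would treat a nonzero return as "the comparison result is not
trustworthy" and report an error rather than use *result for sorting or
equality decisions.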