Albe Laurenz <laurenz.albe@xxxxxxxxxx> writes: > While researching a problem with a different database system, > I came across the following in the SQL standard ISO/IEC 9075-2:2003, > Section 8.2 (<comparison predicate>), General Rules: > 3) The comparison of two character strings is determined as follows: > a) Let CS be the collation as determined by Subclause 9.13, > "Collation determination", for the declared types of the > two character strings. > b) If the length in characters of X is not equal to the length > in characters of Y, then the shorter string is effectively > replaced, for the purposes of comparison, with a copy of itself > that has been extended to the length of the longer string by > concatenation on the right of one or more pad characters, > where the pad character is chosen based on CS. > If CS has the NO PAD characteristic, then the pad character is > an implementation-dependent character different from > any character in the character set of X and Y that collates > less than any string under CS. > Otherwise, the pad character is a <space>. The PAD case is specifying the way that CHAR(n) comparison should work. (We don't expose the PAD/NO PAD distinction in any other way than CHAR vs VARCHAR/TEXT types.) AFAICS, the NO PAD case is ignorable BS: they are basically specifying implementation not semantics there, and in a way that is totally brain-dead. There isn't necessarily any such character as the one they blithely posit. Moreover, the whole description seems to assume that string comparison is single-pass left-to-right, which has little to do with any modern collation specification. We just rely on strcmp to decide that shorter strings are "less" than longer ones, which is the point of this spec AFAICT. Note that we don't actually do CHAR(n) comparison like that either, but instead choose to strip trailing spaces before the comparison. In any case, the most significant word in that whole paragraph is "effectively", which means you can do it however you want as long as you get an equivalent comparison result. > That would effectively mean that 'a'='a ' is TRUE for > all character string types. In the PAD case, yes. Else no. > Of the DBMS I tested, Microsoft SQL Server and MySQL gave me > that very result, while PostgreSQL and Oracle gave me FALSE. This probably has more to do with what these systems think the data type of an undecorated literal is, than with whether they do trailing-space-insensitive comparison all the time. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general