
Re: [HACKERS] 'a' == 'a '


Josh Berkus wrote:

Dann,

I think that whatever is done ought to be whatever the standard says.
If I misinterpret the standard and PostgreSQL is doing it right, then
that is fine.  It is just that PostgreSQL is very counter-intuitive
compared to other database systems that I have used in this one
particular area.  When I read the standard, it looked to me like
PostgreSQL was not performing correctly.  It is not unlikely that I read
it wrong.

AFAICT, the standard says "implementation-specific".  So we're standard.

The main cost of comparing trimmed values is performance; factoring an rtrim into every comparison will add significant overhead to the already CPU-bound process of, for example, creating indexes. We're looking for ways to make the comparison operators lighter-weight, not heavier.

If I understand the spec correctly, it seems to indicate that this is specific to the locale/character set. Assuming that the standard doesn't have anything to do with any character sets, it should be possible to make this available for those who want it as an initdb option. Whether this is important enough to offer is another matter.

Personally my questions are:

1)  How many people have been bitten by this badly?
2)  How many people have been bitten by joins that depend on padding?

Personally, unlike case folding, this seems to be an area where a bit of documentation (i.e. all collations are assumed to have the SQL standard's NO PAD option) would be sufficient to answer questions of standards compliance.
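
To make the two comparison semantics concrete, here is a minimal Python sketch. It is not PostgreSQL code, and the function names eq_no_pad and eq_pad_space are made up for illustration. It assumes that NO PAD means trailing blanks are significant, and that PAD SPACE means the shorter operand is conceptually padded with blanks before the comparison.

```python
def eq_no_pad(a: str, b: str) -> bool:
    """NO PAD style: compare as-is, so trailing blanks are significant."""
    return a == b


def eq_pad_space(a: str, b: str) -> bool:
    """PAD SPACE style: pad the shorter operand with blanks, then compare."""
    width = max(len(a), len(b))
    return a.ljust(width) == b.ljust(width)


print(eq_no_pad("a", "a "))     # False
print(eq_pad_space("a", "a "))  # True
```

The subject line of this thread, 'a' == 'a ', is exactly the case where the two functions disagree.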

My general perspective on this is that if trailing blanks are a significant hazard for your application, then trim them on data input. That requires a *lot* less performance overhead than doing it every time you compare something.

In general I agree. But I am not willing to jump to the conclusion that it will never be warranted to add this as an initdb option. I am more interested in what cases people see where this would be required. But I agree that the bar is much higher than it is in many other cases.
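
The "trim on data input" approach can be sketched as follows. This is a rough Python illustration only; TrimmedColumn is a hypothetical stand-in for a text column, not anything in PostgreSQL. The point it shows is that the trim cost is paid once per write, after which every comparison is a plain equality test.

```python
class TrimmedColumn:
    """Hypothetical stand-in for a text column whose values are trimmed on input."""

    def __init__(self) -> None:
        self.rows: list[str] = []

    def insert(self, value: str) -> None:
        # Trailing blanks are stripped once, at write time.
        self.rows.append(value.rstrip(" "))

    def count_equal(self, needle: str) -> int:
        # Plain equality: stored values are never trimmed again here.
        return sum(1 for row in self.rows if row == needle)


col = TrimmedColumn()
for raw in ["a", "a ", "a   ", "b"]:
    col.insert(raw)

print(col.count_equal("a"))  # 3
```

Note that the search value itself must also arrive without trailing blanks; in this sketch, count_equal("a ") would find nothing.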

Best Wishes,
Chris Travers
Metatron Technology Consulting

