Search Postgresql Archives

Re: Index optimization ?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Bo Lorentsen wrote:
Greg Stark wrote:
If Postgres used an index it would call odd(), which would return 1 because
it's the first time, and then Postgres would go look up the rows where col is
1 and return all of them. That's a very different behaviour from if the index
isn't used. If all the records have col=1 then you're getting all of the
records instead of half of them. If col=0 then you're getting none of them
instead of half of them.
This is the differance between "volatile" and "stable" functions, but not the answer to why an index lookuo are per query and not per row, is it not ?

Because the _whole_ _point_ of an index is to find matching rows _without_ scanning the whole table. IF you have to look at every row
anyway, then just might as well to an sequential scan.


The performance boost an index gives you comes from that very fact that you can avoid looking most of the rows. You only have to traverse the index tree from the root to the node(s) you are looking for. If you have an unique index, this means you have to traverse (roughly) log2(n) nodes, if your table has n rows. (If your table has 2^32, or about 4 billion rows, you just need to do 32(!) comparisons when walking the index tree, but whould need to do at worst 2^32 if you sequentially scanned the table.

_But_ the "downside" is that you skipped rows. You just didn't look at them _ever_. You don't even _know_ how many rows you skipped (Thats why count(*) on a hughe table is slow). So you _can't_ use an index for a where-condition that has to inspect every row. But, since the SQL-spec requires you to call volatile functions _for_every_row_, you _can't_ use
an index in your case.


You can, howevery, accelerate something like "where f in (1,2,3,4)". You just scan the index 4 times, each time for a different value. Of course,
if the number of values becomes larger and larger, there is a point where it's more efficient to do a sequential scan _once_, instead of a few tousand index scans (depends on the number of rows in the table).
The postgres optimizer tries to estimate this, and will switch to an seq-scan, if it would have to do too many index lookups.


Your example (the one with currval()) would translate to a IN-clause with about as many entries in the IN-list is you have rows in the table (Because the function has to be called _for_ _every_ _row_). Clearly,
it's not feasable to use an index scan here - it would be slower than a seq scan.


greetings, florian pflug

Attachment: smime.p7s
Description: S/MIME Cryptographic Signature


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux