Florian G. Pflug wrote:
Lets say, you have an query "select * from table where field = function()". Now, according to the sql-spec, you would have to
scan each row in "table", call the function "functio()", and compare the
result. If the result of the call to "function()" matches the value in "field", the you return the row, otherwise you don't return it.
This far I follow :-)
An index is a tree, where each node has two subnodes/children. Each node
representents a row of your table (or, rather, references a row - it contails only the value of the field you indexed). Additionally,
the value of the field of the "left" child (and the value of the field of its children, and so on) is always guaranteed to be smaller-or-equal to the value of field of the node itself, and the value
of the field of the "right" child is always guaranteed to be greater-or-equal to the value of the field of the node.
What you say, is based on the result of the evaluation, the executer will optimize or performe the index lookup, if the righthand is a constant, but it will perform a seq scan if the value is not known on the first query (volatile) ?
To me this sound like the indexes can't be performed per row, but only per query ? Or, PG is not type aware when maching indexes ?
Now, when doing an index lookup, you have to know which value to look for (lets say you look for 3). Then you look at the top node, and compare the value you are looking for to the value of the node. In our case, the node has a smaller value then the one we are looking for. Because we know that the left child of the toplevel node will have an even smaller value, we don't need to look at the left child at all. We just check the right child, and there we find our record with "field"=3.
But why can't we evaluate righthand and use this new row value (could still be 3, but not 'foobar' :-)) as a key to the index ?
_BUT_ this only works, because we knew for which value to look before we
started searching. If we the value we look for is constantly changing, our index lookup would return bogus results. So, if the value is unknown
_at_the_beginning_ of the search, you can't use the index, because the power of an index search comes from the idea to skip a whole subtree (in
our case the left one), because we _know_ it can't contain the value we are looking for.
But the "value" is unknow yes ... but the type of the value is not, this will not change from row to row only the value do this.
Functions marked "immutable" or "stable" is _guaranteed_ to never change their value during a select statement, or at least not in an unpredictable way. Thats why you can use return values of "immutable" or "stable" functions for an index search.
I understand that, I just can't see why an index lookup can't be used on "per row" basis.
Lets say, you are doing the following query
"select * from table where field1 = currval('seq')" and field2 = nextval('seq')
Now, the value of "currval('seq')" changes with every row visited. In your example, the value of "currval" is actually stable - but postgres has no way to know this. To use an index scan for your query, postgres
would need to know, that only a call to nextval can change the value of currval - but this, of course, is a quite difficult dependency for a
database to track.
But why can't the integer index for "field1" be used, with the "currval" result, even if it changes ? Are the index lookup only performed ones (asked this before ) per query ?
/BL
---------------------------(end of broadcast)--------------------------- TIP 6: Have you searched our list archives?
http://archives.postgresql.org