Re: Index optimization ?

Tom Lane <tgl@xxxxxxxxxxxxx> · Sun, 16 Jan 2005 15:50:33 -0500

Bo Lorentsen <bl@xxxxxxxxxxx> writes:
> Tom Lane wrote:
>>	SELECT * FROM mytable WHERE random() < 0.1;
>> If we evaluated random() only once in this query, we would get either
>> all or none of the rows, clearly not the right answer.
>> 
> So if the random function was stable, you either get all or none, as et 
> gets executed only ones ?

No, you'd still end up with a seqscan, because this WHERE clause offers
no chance of matching an index, and we don't do anything special with
stable functions beyond trying to match them to index conditions.
But consider something like

	SELECT * FROM mytable WHERE keycol = int(random() * 1000);

where keycol is indexed and contains integers 0..1000; let's say each
such value appears ten times.  With a seqscan implementation (which I
consider is what SQL defines the semantics to be) random() would be
recomputed at each row and there would be about a 1/1000 chance of
selecting each row.  You might get more or less than exactly ten result
rows, and they'd almost certainly contain different values of keycol.
Now if random() were marked stable (and of course both multiply and
int() are immutable), then the planner would consider an indexscan on
keycol to be a valid optimization.  But that would produce
distinguishably different results, because random() would be evaluated
only once: you would always get exactly ten rows and they'd always all
have the same keycol value.

>> An indexscan is a legal optimization only if the function(s) in the
>> WHERE clause are all STABLE or better.  This is because the index access
>> code will only evaluate the righthand side of the "indexcol = something"
>> clause once, and then will use that value to descend the btree and
>> select matching index entries.  We must be certain that this gives the
>> same result we would get from a seqscan.
>> 
> Now this sounds like a blink of the problem that I don't understand :-) 
> When you say it evaluate "right side" ones, what kind of information are 
> you (the executer) then getting, and how is the index match then 
> performed.

An index can basically implement conditions like "WHERE indexedcol =
constant" --- it takes the constant value and searches the index for
matches.  (Btrees can also do things like WHERE indexedcol <= constant,
but let's just think about equality to keep things simple.)  We can deal
with a nonconstant righthand side, so long as it's okay to evaluate the
value just once before the index starts to do its thing.  That
assumption is what STABLE is all about.

			regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
      joining column's datatypes do not match