Search Postgresql Archives

Re: Index optimization ?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Okay, let's look at this a different way.

When you look at a volatile function or variable, let's say CURRENT_TIMESTAMP (which returns the current date and time as of the beginning of the transaction), you see a function or variable whose value changes unpredictably between calls/reads.

Now let's look at that value being inserted into a table. Say table X has column Y. We insert a few rows into X, and use CURRENT_TIMESTAMP as the value for Y.

So we have something like this (example only):

SELECT Y FROM X;

Y
- -----
1
4
6
9
14

Later, we want to find all of the rows we just inserted. We can't very well do this using Y, since:

SELECT * FROM X WHERE Y = CURRENT_TIMESTAMP;

When CURRENT_TIMESTAMP is something like, say, 44, will not find any of the same rows.


Now expand on this to look at an index on a volatile function. When the row inserts/updates take place, the function has a specific value which it returns. Now we can index this return value, but when we go to search the results, we evaluate the function and get a *DIFFERENT VALUE* -- thus, our search of the index reveals a *different* set of rows than the one we were hoping to find, since the value it is looking for (the result of the function call) is not the same as it was when we built the index.


Why would you *want* to do this? It is roughly equivalent to building an index on random values, to return a random set of rows -- and this is just as. if not more efficiently done without the index.


Again, looking at the ODD function, consider the above where column Z becomes odd():


Y       Z
- ------------
1        1
4        0
6        1
9        0
14      1

Now run SELECT Y, Z FROM X WHERE Z = ODD() to get:

Y       Z
- ------------
4        0
9        0


Run the same query again to get:

Y       Z
- ------------
1        1
6        1
14      1



You see that the results will be inconsistent, since ODD() is volatile and its result can change at any time (at least as far as PostgreSQL is aware of it).


On Jan 17, 2005, at 10:30 AM, Bo Lorentsen wrote:

Tom Lane wrote:

No, you'd still end up with a seqscan, because this WHERE clause offers
no chance of matching an index, and we don't do anything special with
stable functions beyond trying to match them to index conditions.


So, the executer uses the (first) value to find the index to use for ALL rows, and if this value change on each row, this can't be optimized and a seq scan is initiated.

Is this not a problem for joins ?

But consider something like

	SELECT * FROM mytable WHERE keycol = int(random() * 1000);

where keycol is indexed and contains integers 0..1000; let's say each
such value appears ten times.  With a seqscan implementation (which I
consider is what SQL defines the semantics to be) random() would be
recomputed at each row and there would be about a 1/1000 chance of
selecting each row.

This would demand a new index lookup for each row, right ?

You might get more or less than exactly ten result
rows, and they'd almost certainly contain different values of keycol.

This much i do understand :-)

Now if random() were marked stable (and of course both multiply and
int() are immutable), then the planner would consider an indexscan on
keycol to be a valid optimization.  But that would produce
distinguishably different results, because random() would be evaluated
only once: you would always get exactly ten rows and they'd always all
have the same keycol value.

I know why random (and currval) is not stabel, but I just don't understand why a variable righthand result in seq scan, and not an index scan, even when the data types match an index type.

To me it sounds like an index lookup is a one time a query (not per row) thing, but I don't understand why. This can be because, this is the way it turned up, but there is more possibly an aspect of SQL that I don't know too much about.

An index can basically implement conditions like "WHERE indexedcol =
constant" --- it takes the constant value and searches the index for
matches. (Btrees can also do things like WHERE indexedcol <= constant,
but let's just think about equality to keep things simple.)


:-)

We can deal
with a nonconstant righthand side, so long as it's okay to evaluate the
value just once before the index starts to do its thing. That
assumption is what STABLE is all about.


So righthand value can't evaluate per row, and the value type of the righthand expression can't be used as a index match.

I just hoped for the executer to work like this :

find indexedcol indexs

evaluate the righthand expression, and find its type (not value)

match the righthand value type and match it on index types (is both sides integer)

if index is found use this together with the per row righthand value

or just use seq scan (I don't understand why, this works if indexes don't)

This is what I thought PG was doing :-)

Hope, I did not miss any important points.

/BL


---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings



- -----------------------------------------------------------
Frank D. Engel, Jr.  <fde101@xxxxxxxxxxx>

$ ln -s /usr/share/kjvbible /usr/manual
$ true | cat /usr/manual | grep "John 3:16"
John 3:16 For God so loved the world, that he gave his only begotten Son, that whosoever believeth in him should not perish, but have everlasting life.
$
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (Darwin)


iD8DBQFB698N7aqtWrR9cZoRAqxeAJsFGjPqvRlk5tBkW50uxzbarrJfbwCeOIj4
aHnMMXCeFXt61Ziji25h1+E=
=PCKJ
-----END PGP SIGNATURE-----



___________________________________________________________
$0 Web Hosting with up to 120MB web space, 1000 MB Transfer
10 Personalized POP and Web E-mail Accounts, and much more.
Signup at www.doteasy.com


---------------------------(end of broadcast)--------------------------- TIP 1: subscribe and unsubscribe commands go to majordomo@xxxxxxxxxxxxxx

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux