Re: window function induces full table scan

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Just to track it down: The limitation can also be reproduced without using views. Using views is just a use case where the suggested optimization is actually needed.

Plus, when I remove the condition "WHERE datepos = 1", the same behaviour still occurs. Here, I wanted to see if postgresql is preferring the condition "WHERE datepos = 1" (datepos is the result of the window function) over the condition "user_id = 43" for optimization. But this is not the case.

-- workaround example: "WHERE user_id = 43" condition in subselect

SELECT user_id, latitude, longitude
FROM (
       SELECT
         user_id,
         latitude,
         longitude,
rank() OVER (PARTITION BY user_id ORDER BY date DESC, id DESC) AS datepos
       FROM checkin_node
       WHERE user_id = 43
     ) AS tmp_last_position
WHERE datepos = 1; -- takes 2 ms

-- track it down: reproduce limitation without a view:

SELECT user_id, latitude, longitude
FROM (
       SELECT
         user_id,
         latitude,
         longitude,
         rank() OVER (PARTITION BY user_id ORDER BY date DESC, id DESC)
           AS datepos
       FROM checkin_node
     ) AS tmp_last_position
WHERE datepos = 1
AND user_id = 43; -- takes 6621 ms

-- without datepos condition

SELECT user_id, latitude, longitude
FROM (
       SELECT
         user_id,
         latitude,
         longitude,
         rank() OVER (PARTITION BY user_id ORDER BY date DESC, id DESC)
           AS datepos
       FROM checkin_node
     ) AS tmp_last_position
WHERE user_id = 43; -- takes 6574 ms

Best regards,
Thomas


Am 03.01.2014 00:12, schrieb Thomas Mayer:

Am 02.01.2014 23:43, schrieb Tom Lane:
Jeff Janes <jeff.janes@xxxxxxxxx> writes:
On Thu, Jan 2, 2014 at 1:52 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
It's possible that in the specific case you exhibit here, pushing down
the clause wouldn't result in changes in the window function's output for
the selected rows, but the optimizer doesn't have enough knowledge about
window functions to determine that.

A restriction in the WHERE clause which corresponds to the PARTITION BY
should be pushable, no?  I think it doesn't need to understand the internal
semantics of the window function itself, just of the PARTITION BY, which
should be doable, at least in principle.

If the restriction clause must give the same answer for any two rows of
the same partition, then yeah, we could in principle push it down without
knowing anything about the specific window function.  It'd be a less than
trivial test to make, I think.  In any case, it's not a "bug" that the
optimizer doesn't do this currently.

I agree, this is not a "bug" in v9.3.2 in terms of correctness.

But it's a limitation, because the query plan is by far not optimal. You
may consider this report as a feature request then.

The optimization I suggested is normally performed, when no window
function occurs in the statement.

It seems like the optimizer is already capable of doing a check if the
WHERE can be done first.

However, this check seems to be done too conservative: I guess, the
check is ignoring the PARTITION-BY-sets of attributes completely.


			regards, tom lane
.


Best regards
Thomas



--
======================================
Thomas Mayer
Durlacher Allee 61
D-76131 Karlsruhe
Telefon: +49-721-2081661
Fax:     +49-721-72380001
Mobil:   +49-174-2152332
E-Mail:  thomas.mayer@xxxxxxxxxxxxxxx
=======================================



--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance




[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux