Search Postgresql Archives

Query plan different depending on the value of where filter

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi All,

 

I have an interesting query scenario I’m trying to understand.

I came across this while investigating a slow query in our application.

I’ve been able to reproduce the scenario in a psql script that sets up the tables and runs the queries.

Script here http://pastebin.com/CBkdDmWp if anyone is interested.

 

This is the scenario.

Version is "PostgreSQL 9.2.4, compiled by Visual C++ build 1600, 32-bit"

Three tables

Outer_tab  : 5000 rows

Inner_tab_1 : 1000 rows

Inner_tab_2 : 16000 rows

 

This is the query

SELECT outer_tab.outer_key

  FROM outer_tab

WHERE outer_tab.outer_key IN (

           SELECT inner_tab_1.key_to

             FROM inner_tab_2

            INNER JOIN inner_tab_1 ON (inner_tab_2.outer_key = inner_tab_1.key_from AND inner_tab_1.type = 2)

            WHERE outer_tab.outer_key = inner_tab_1.key_to AND inner_tab_2.group_id = 9

            );

 

Two important things here, type = 2 does not occur in inner_tab_1 and group_id = 9 does not occur in inner_tab_2 and group_id is not indexed.

The result is 0 rows.

Now this is quite slow about 15 seconds on my machine.

Here is the explain plan http://explain.depesz.com/s/BVg

I understand that the seq scan on inner_tab_2 and its 16000 rows is the culprit and the easy fix is to swap inner_tab_2 and inner_tab_1 between the FROM and the JOIN.

This lets it drive off inner_tab_1 with an index scan and skip the sequential scan as seen here http://explain.depesz.com/s/pkG

Much better at 14ms.

 

That’s fine but what has me somewhat confused is if group_id in the WHERE is changed to 1, which does exist in inner_tab_2, we get quite a different plan.

http://explain.depesz.com/s/FX4

It’s quick too 63ms

 

What I don’t understand is why the plan is different just because the group_id = has changed value?

Does the planner have some statistical info on the contents of non-indexed rows?

I don’t quite understand why this plan executes the sequential scan once, whereas the slow one does it 5001 times, which I believe is the main source of the difference.

 

Also if I don’t ANALYZE  the tables the original query will run in a few ms instead of 15 seconds, it actually uses the same query plan that swapping the tables creates.

So it runs the index scan on inner_tab_1 first.

It’s a bit surprising that with ANALYSE it picks a plan that is so much worse.

Any one able to shed some light?

 

Thanks for your time,

 

Denis Looby

 


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux