Olivier Sirven wrote:
The query is not especially fast, but it works fine since it completes in less than 1 second.
The problem is that if I change the filter value of id_category from 15 to 3,
the query takes more than 7 minutes to complete! The only difference
between id_category 3 and 15 is that there are about 4000 rows for the first
one (id_category = 3) and 2000 rows for the second one (id_category = 15).
An EXPLAIN gives me the following result:
EXPLAIN ANALYSE would be more useful - it'll show whether the row
estimates are actually accurate.
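In case it's not clear, that just means prefixing the same statement with
EXPLAIN ANALYSE (the query itself isn't quoted here, so the SELECT below is
only a placeholder):

  EXPLAIN ANALYSE
  SELECT ... ;   -- the original query, unchanged

Each node in the resulting plan then shows an actual row count and timing
next to the planner's estimate, so we can see where the estimates go wrong.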
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Limit  (cost=0.00..9677.68 rows=20 width=4)
  ->  Nested Loop  (cost=0.00..61006657.19 rows=126077 width=4)
        ->  Nested Loop  (cost=0.00..59991538.61 rows=252145 width=12)
              ->  Index Scan Backward using generals_topics_pkey on generals_topics gt  (cost=0.00..615679.86 rows=14750423 width=8)
              ->  Index Scan using ix_category_generals_id_general on category_generals cs  (cost=0.00..4.01 rows=1 width=4)
                    Index Cond: ("outer".id_general = cs.id_general)
                    Filter: (id_category = 3)
        ->  Index Scan using generals_id_topic_key on generals g  (cost=0.00..4.01 rows=1 width=4)
              Index Cond: (g.id_general = "outer".id_general)
              Filter: media
As you can see, every row of the generals_topics table is scanned, and I don't
understand why. How can I make PostgreSQL work only with the tuples resulting
from the join conditions? Is it a configuration problem?
It thinks you're going to get 126077 rows back at the top level. VACUUM
your table(s), ANALYSE them, and then let's look at the EXPLAIN ANALYSE
for this query. It might then be that we need to increase the statistics
on one or more columns.
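Something along these lines should do it (the table and column names are
taken from the plan above, and the statistics target of 100 is just an
example value):

  VACUUM ANALYSE generals_topics;
  VACUUM ANALYSE category_generals;
  VACUUM ANALYSE generals;

  -- then re-run the query and compare estimated vs actual row counts
  EXPLAIN ANALYSE SELECT ... ;   -- your original query

  -- if the estimate for id_category is still badly off, raise the
  -- per-column statistics target and re-ANALYSE that table
  ALTER TABLE category_generals ALTER COLUMN id_category SET STATISTICS 100;
  ANALYSE category_generals;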
--
Richard Huxton
Archonet Ltd