On 09/10/2013 12:57, Jesse Long wrote:
On 09/10/2013 12:10, Albe Laurenz wrote:
Jesse Long wrote:
I have the following query, run immediately after executing VACUUM in
the database. There is only one connection to the database.
You should run ANALYZE, not VACUUM.
The query runs for much longer than I expect it to run for, and I think
this is due to it using the incorrect subplan. As you can see, subplans
1 and 3 make use of and index, but these subplans are not used.
Subplans and 4 are seqscan, and they are used.
How can I get PostgreSQL to use subplan 1 and 3?
They are only possible if an "Index Only Scan" is possible, which
can only be used if the respective table entries are visible for
all transactions.
testdb=> explain analyse SELECT * FROM ARCHIVE_DOCUMENT AS r0 WHERE
r0.NODE_ID = 29 AND
r0.ARCHIVE_DATE >= '2013-07-08 18:28:00' AND (EXISTS (SELECT r1.*
FROM ARCHIVE_DOCUMENT_INDEX AS r1
WHERE r1.ARCHIVE_ID = r0.ID AND r1.NODE_ID = r0.NODE_ID AND r1.VALUE
= 'BSH70002152' ) OR EXISTS (
SELECT r2.* FROM ARCHIVE_DOCUMENT_INDEX AS r2 WHERE r2.ARCHIVE_ID =
r0.ID AND r2.NODE_ID = r0.NODE_ID
AND r2.VALUE = 'TC212592' ) ) ORDER BY r0.ARCHIVE_DATE DESC LIMIT 10;
[...]
Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) OR
(alternatives: SubPlan 3 or hashed
SubPlan 4))
Rows Removed by Filter: 710851
SubPlan 1
-> Index Only Scan using
archive_document_index_x_archive_id_node_id_value on
archive_document_index r1 (cost=0.57..4.59 rows=1 width=0) (never
executed)
Index Cond: ((archive_id = r0.id) AND (node_id =
r0.node_id) AND (value =
'BSH70002152'::text))
Heap Fetches: 0
SubPlan 2
-> Seq Scan on archive_document_index r1_1
(cost=0.00..1958104.00 rows=1520 width=16)
(actual time=44418.383..44558.293 rows=4 loops=1)
Filter: ((value)::text = 'BSH70002152'::text)
Rows Removed by Filter: 95009919
SubPlan 3
-> Index Only Scan using
archive_document_index_x_archive_id_node_id_value on
archive_document_index r2 (cost=0.57..4.59 rows=1 width=0) (never
executed)
Index Cond: ((archive_id = r0.id) AND (node_id =
r0.node_id) AND (value =
'TC212592'::text))
Heap Fetches: 0
SubPlan 4
-> Seq Scan on archive_document_index r2_1
(cost=0.00..1958104.00 rows=1520 width=16)
(actual time=41659.464..41663.342 rows=1 loops=1)
Filter: ((value)::text = 'TC212592'::text)
Rows Removed by Filter: 95009922
The estimates are quite off.
Does "ANALYZE archive_document", possibly after increasing
default_statistics_target, make a difference?
Yours,
Laurenz Albe
Hi Laurenz,
Thank you for the feedback.
There is no problem with row visibility, there is only one connection
to the database - the connection I am using to do these selects.
Thanks you for the advise regarding ANALYZE. I ran ANALYZE on both
tables concerned, but not much changed:
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.56..151.85 rows=10 width=122) (actual
time=40841.984..85668.213 rows=2 loops=1)
-> Index Scan Backward using
idx_archive_document_x_node_id_archive_date on archive_document r0
(cost=0.56..7627640.20 rows=504186 width=122) (actual time=40841.98
Index Cond: ((node_id = 29) AND (archive_date >= '2013-07-08
18:28:00'::timestamp without time zone))
Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) OR
(alternatives: SubPlan 3 or hashed SubPlan 4))
Rows Removed by Filter: 710851
SubPlan 1
-> Index Only Scan using
archive_document_index_x_archive_id_node_id_value on
archive_document_index r1 (cost=0.57..4.59 rows=1 width=0) (never
executed)
Index Cond: ((archive_id = r0.id) AND (node_id =
r0.node_id) AND (value = 'BSH70002152'::text))
Heap Fetches: 0
SubPlan 2
-> Seq Scan on archive_document_index r1_1
(cost=0.00..1958101.80 rows=1568 width=16) (actual
time=36633.365..40841.909 rows=4 loops=1)
Filter: ((value)::text = 'BSH70002152'::text)
Rows Removed by Filter: 95009919
SubPlan 3
-> Index Only Scan using
archive_document_index_x_archive_id_node_id_value on
archive_document_index r2 (cost=0.57..4.59 rows=1 width=0) (never
executed)
Index Cond: ((archive_id = r0.id) AND (node_id =
r0.node_id) AND (value = 'TC212592'::text))
Heap Fetches: 0
SubPlan 4
-> Seq Scan on archive_document_index r2_1
(cost=0.00..1958101.80 rows=1568 width=16) (actual
time=40241.599..44462.485 rows=1 loops=1)
Filter: ((value)::text = 'TC212592'::text)
Rows Removed by Filter: 95009922
Total runtime: 85676.734 ms
(22 rows)
Sorry, I neglected to mention that this is on PostgreSQL 9.3.0.
Thanks,
Jesse
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general