Search Postgresql Archives

Re: Incorrect index being used

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 09/10/2013 12:10, Albe Laurenz wrote:
Jesse Long wrote:
I have the following query, run immediately after executing VACUUM in
the database. There is only one connection to the database.
You should run ANALYZE, not VACUUM.

The query runs for much longer than I expect it to run for, and I think
this is due to it using the incorrect subplan. As you can see, subplans
1 and 3 make use of and index, but these subplans are not used.
Subplans  and 4 are seqscan, and they are used.

How can I get PostgreSQL to use subplan 1 and 3?
They are only possible if an "Index Only Scan" is possible, which
can only be used if the respective table entries are visible for
all transactions.

testdb=> explain analyse SELECT * FROM ARCHIVE_DOCUMENT AS r0 WHERE r0.NODE_ID = 29 AND
r0.ARCHIVE_DATE >= '2013-07-08 18:28:00' AND (EXISTS (SELECT r1.* FROM ARCHIVE_DOCUMENT_INDEX AS r1
WHERE r1.ARCHIVE_ID = r0.ID AND r1.NODE_ID = r0.NODE_ID AND r1.VALUE = 'BSH70002152' ) OR EXISTS (
SELECT r2.* FROM ARCHIVE_DOCUMENT_INDEX AS r2 WHERE r2.ARCHIVE_ID = r0.ID AND r2.NODE_ID = r0.NODE_ID
AND r2.VALUE = 'TC212592' ) ) ORDER BY r0.ARCHIVE_DATE DESC LIMIT 10;
[...]

           Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) OR (alternatives: SubPlan 3 or hashed
SubPlan 4))

           Rows Removed by Filter: 710851

           SubPlan 1

             ->  Index Only Scan using archive_document_index_x_archive_id_node_id_value on
archive_document_index r1  (cost=0.57..4.59 rows=1 width=0) (never executed)

                   Index Cond: ((archive_id = r0.id) AND (node_id = r0.node_id) AND (value =
'BSH70002152'::text))

                   Heap Fetches: 0

           SubPlan 2

             ->  Seq Scan on archive_document_index r1_1  (cost=0.00..1958104.00 rows=1520 width=16)
(actual time=44418.383..44558.293 rows=4 loops=1)

                   Filter: ((value)::text = 'BSH70002152'::text)

                   Rows Removed by Filter: 95009919

           SubPlan 3

             ->  Index Only Scan using archive_document_index_x_archive_id_node_id_value on
archive_document_index r2  (cost=0.57..4.59 rows=1 width=0) (never executed)

                   Index Cond: ((archive_id = r0.id) AND (node_id = r0.node_id) AND (value =
'TC212592'::text))

                   Heap Fetches: 0

           SubPlan 4

             ->  Seq Scan on archive_document_index r2_1  (cost=0.00..1958104.00 rows=1520 width=16)
(actual time=41659.464..41663.342 rows=1 loops=1)

                   Filter: ((value)::text = 'TC212592'::text)

                   Rows Removed by Filter: 95009922
The estimates are quite off.
Does "ANALYZE archive_document", possibly after increasing
default_statistics_target, make a difference?

Yours,
Laurenz Albe


Hi Laurenz,

Thank you for the feedback.

There is no problem with row visibility, there is only one connection to the database - the connection I am using to do these selects.

Thanks you for the advise regarding ANALYZE. I ran ANALYZE on both tables concerned, but not much changed:

                                                                                             QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------

 Limit  (cost=0.56..151.85 rows=10 width=122) (actual time=40841.984..85668.213 rows=2 loops=1)

   ->  Index Scan Backward using idx_archive_document_x_node_id_archive_date on archive_document r0  (cost=0.56..7627640.20 rows=504186 width=122) (actual time=40841.98

         Index Cond: ((node_id = 29) AND (archive_date >= '2013-07-08 18:28:00'::timestamp without time zone))

         Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) OR (alternatives: SubPlan 3 or hashed SubPlan 4))

         Rows Removed by Filter: 710851

         SubPlan 1

           ->  Index Only Scan using archive_document_index_x_archive_id_node_id_value on archive_document_index r1  (cost=0.57..4.59 rows=1 width=0) (never executed)

                 Index Cond: ((archive_id = r0.id) AND (node_id = r0.node_id) AND (value = 'BSH70002152'::text))

                 Heap Fetches: 0

         SubPlan 2

           ->  Seq Scan on archive_document_index r1_1  (cost=0.00..1958101.80 rows=1568 width=16) (actual time=36633.365..40841.909 rows=4 loops=1)

                 Filter: ((value)::text = 'BSH70002152'::text)

                 Rows Removed by Filter: 95009919

         SubPlan 3

           ->  Index Only Scan using archive_document_index_x_archive_id_node_id_value on archive_document_index r2  (cost=0.57..4.59 rows=1 width=0) (never executed)

                 Index Cond: ((archive_id = r0.id) AND (node_id = r0.node_id) AND (value = 'TC212592'::text))

                 Heap Fetches: 0

         SubPlan 4

           ->  Seq Scan on archive_document_index r2_1  (cost=0.00..1958101.80 rows=1568 width=16) (actual time=40241.599..44462.485 rows=1 loops=1)

                 Filter: ((value)::text = 'TC212592'::text)

                 Rows Removed by Filter: 95009922

 Total runtime: 85676.734 ms

(22 rows)

Thanks,
Jesse


--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux