Hi Stefan! Probably you need to rewrite your query like this (check it first): with RECURSIVE qq(cont_key, anc_key) as ( select min(a1.context_key), ancestor_key from virtual_ancestors a1 union select (SELECT a1.context_key, ancestor_key FROM virtual_ancestors a1 where context_key > cont_key order by context_key limit 1) from qq where cont_key is not null ) select a1.cont_key from qq a1, collection_data, virtual_ancestors a2 WHERE a1.anc_key = collection_data.context_key AND collection_data.collection_context_key = a2.context_key AND a2.ancestor_key = ? best regards, Ilya On Fri, Mar 21, 2014 at 12:56 AM, Stefan Amshey <sramshey@xxxxxxxxx> wrote: > We have a slow performing query that we are trying to improve, and it > appears to be performing a sequential scan at a point where it should be > utilizing an index. Can anyone tell me why postgres is opting to do it this > way? > > The original query is as follows: > > SELECT DISTINCT > a1.context_key > FROM > virtual_ancestors a1, collection_data, virtual_ancestors a2 > WHERE > a1.ancestor_key = collection_data.context_key > AND collection_data.collection_context_key = a2.context_key > AND a2.ancestor_key = ? > > The key relationships should all using indexed columns, but the query plan > that postgres comes up with ends up performing a sequential scan on the > collection_data table (in this case about 602k rows) where we would have > expected it to utilize the index: > > HashAggregate (cost=60905.73..60935.73 rows=3000 width=4) (actual > time=3366.165..3367.354 rows=3492 loops=1) > Buffers: shared hit=16291 read=1222 > -> Nested Loop (cost=17546.26..60898.23 rows=3000 width=4) (actual > time=438.332..3357.918 rows=13037 loops=1) > Buffers: shared hit=16291 read=1222 > -> Hash Join (cost=17546.26..25100.94 rows=98 width=4) (actual > time=408.554..415.767 rows=2092 loops=1) > Hash Cond: (a2.context_key = > collection_data.collection_context_key) > Buffers: shared hit=4850 read=3 > -> Index Only Scan using virtual_ancestors_pkey on > virtual_ancestors a2 (cost=0.00..233.32 rows=270 width=4) (actual > time=8.532..10.703 rows=1960 loops=1) > Index Cond: (ancestor_key = 1072173) > Heap Fetches: 896 > Buffers: shared hit=859 read=3 > -> Hash (cost=10015.56..10015.56 rows=602456 width=8) > (actual time=399.708..399.708 rows=602570 loops=1) > Buckets: 65536 Batches: 1 Memory Usage: 23538kB > Buffers: shared hit=3991 > ######## sequential scan occurs here ########## > -> Seq Scan on collection_data (cost=0.00..10015.56 > rows=602456 width=8) (actual time=0.013..163.509 rows=602570 loops=1) > Buffers: shared hit=3991 > -> Index Only Scan using virtual_ancestors_pkey on > virtual_ancestors a1 (cost=0.00..360.70 rows=458 width=8) (actual > time=1.339..1.403 rows=6 loops=2092) > Index Cond: (ancestor_key = collection_data.context_key) > Heap Fetches: 7067 > Buffers: shared hit=11441 read=1219 > Total runtime: 3373.058 ms > > > The table definitions are as follows: > > Table "public.virtual_ancestors" > Column | Type | Modifiers > --------------+----------+----------- > ancestor_key | integer | not null > context_key | integer | not null > degree | smallint | not null > Indexes: > "virtual_ancestors_pkey" PRIMARY KEY, btree (ancestor_key, context_key) > "virtual_context_key_idx" btree (context_key) > Foreign-key constraints: > "virtual_ancestors_ancestor_key_fkey" FOREIGN KEY (ancestor_key) > REFERENCES contexts(context_key) > "virtual_ancestors_context_key_fkey" FOREIGN KEY (context_key) > REFERENCES contexts(context_key) > > Table "public.collection_data" > Column | Type | Modifiers > ------------------------+----------------------+----------- > collection_context_key | integer | not null > context_key | integer | not null > type | character varying(1) | not null > source | character varying(1) | not null > Indexes: > "collection_data_context_key_idx" btree (context_key) > "collection_data_context_key_index" btree (collection_context_key) > CLUSTER > Foreign-key constraints: > "collection_data_collection_context_key_fkey" FOREIGN KEY > (collection_context_key) REFERENCES contexts(context_key) ON DELETE CASCADE > "collection_data_context_key_fkey" FOREIGN KEY (context_key) REFERENCES > contexts(context_key) ON DELETE CASCADE > > Can anyone suggest a way that we can get postgres to use the > collection_data_context_key_index properly? I thought that it might be > related to the fact that collection_data_context_key_index is a CLUSTERED > index, but we did some basic experimentation that seems to indicate > otherwise, i.e. the bad plan persists despite re-clustering the index. > > We are using PostgreSQL 9.2.5 on x86_64-unknown-linux-gnu, compiled by gcc > (Debian 4.4.5-8) 4.4.5, 64-bit > > Interestingly, on an instance running PostgreSQL 9.2.4 on > x86_64-unknown-linux-gnu, compiled by gcc (Debian 4.4.5-8) 4.4.5, 64-bit > where I copied the 2 tables over to a temporary database, the plan comes out > differently: > > HashAggregate (cost=39692.03..39739.98 rows=4795 width=4) (actual > time=73.285..75.141 rows=3486 loops=1) > Buffers: shared hit=22458 > -> Nested Loop (cost=0.00..39680.05 rows=4795 width=4) (actual > time=0.077..63.116 rows=13007 loops=1) > Buffers: shared hit=22458 > -> Nested Loop (cost=0.00..32823.38 rows=164 width=4) (actual > time=0.056..17.685 rows=2084 loops=1) > Buffers: shared hit=7529 > -> Index Only Scan using virtual_ancestors_pkey on > virtual_ancestors a2 (cost=0.00..1220.85 rows=396 width=4) (actual > time=0.025..2.732 rows=1954 loops=1) > Index Cond: (ancestor_key = 1072173) > Heap Fetches: 1954 > Buffers: shared hit=1397 > ######## Note the index scan here - this is what it SHOULD be doing > ############## > -> Index Scan using collection_data_context_key_index on > collection_data (cost=0.00..79.24 rows=56 width=8) (actual > time=0.004..0.005 rows=1 loops=1954) > Index Cond: (collection_context_key = a2.context_key) > Buffers: shared hit=6132 > -> Index Only Scan using virtual_ancestors_pkey on > virtual_ancestors a1 (cost=0.00..35.40 rows=641 width=8) (actual > time=0.007..0.015 rows=6 loops=2084) > Index Cond: (ancestor_key = collection_data.context_key) > Heap Fetches: 13007 > Buffers: shared hit=14929 > Total runtime: 76.431 ms > > Why can't I get the Postgres 9.2.5 instance to use the optimal plan? > > Thanks in advance! > /Stefan > > -- > - > Stefan Amshey -- Ilya Kosmodemiansky, PostgreSQL-Consulting.com tel. +14084142500 cell. +4915144336040 ik@xxxxxxxxxxxxxxxxxxxxxxxxx -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance