Hi Stefan,

stupid me - I've missed something. It should be:

with RECURSIVE qq(cont_key, anc_key) AS (
    SELECT a1.context_key, ancestor_key
    FROM virtual_ancestors a1
    UNION
    select (  -- here, in the union
        SELECT a1.context_key, a1.ancestor_key
        FROM virtual_ancestors a1
        WHERE a1.context_key > cont_key
        ORDER BY a1.context_key LIMIT 1
    )
    from qq where cont_key is not null  -- and here
)
SELECT distinct a.cont_key
FROM qq a, collection_data, virtual_ancestors a2
WHERE
    a.cont_key IS NOT NULL
    AND a.anc_key = collection_data.context_key
    AND collection_data.collection_context_key = a2.context_key
    AND a2.ancestor_key = 1072173;

sorry for the confusion
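For reference, this rewrite is aiming at the "loose index scan" pattern from the PostgreSQL wiki: instead of scanning all of virtual_ancestors, the recursive CTE hops along an index, fetching one distinct value per step. A minimal sketch of the generic pattern (distinct context_key values only, not the full join; an illustration rather than a drop-in replacement) looks like this:

    WITH RECURSIVE t AS (
        -- seed: the smallest context_key in the index
        (SELECT context_key FROM virtual_ancestors
         ORDER BY context_key LIMIT 1)
        UNION ALL
        -- step: jump to the next distinct context_key via the index
        SELECT (SELECT context_key FROM virtual_ancestors
                WHERE context_key > t.context_key
                ORDER BY context_key LIMIT 1)
        FROM t
        WHERE t.context_key IS NOT NULL
    )
    SELECT context_key FROM t WHERE context_key IS NOT NULL;

Each recursive step should be satisfied by a single probe of virtual_context_key_idx, so the whole walk costs roughly one index descent per distinct value rather than a full scan of the ~10.8M-row table.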
On Mon, Mar 24, 2014 at 7:40 PM, Stefan Amshey <sramshey@xxxxxxxxx> wrote:
> Hi Ilya-
>
> Thanks so much for taking a stab at optimizing that query. I had to fiddle
> a bit with your proposed version in order to get it to function. Here's
> what I came up with in the end:
>
>> with RECURSIVE qq(cont_key, anc_key) AS
>> (
>>     SELECT
>>         a1.context_key, ancestor_key
>>     FROM
>>         virtual_ancestors a1
>>     UNION (
>>         SELECT
>>             a1.context_key, a1.ancestor_key
>>         FROM
>>             virtual_ancestors a1, qq
>>         WHERE
>>             context_key > qq.cont_key
>>         ORDER BY
>>             context_key LIMIT 1
>>     )
>> )
>> SELECT
>>     distinct a.cont_key
>> FROM
>>     qq a, collection_data, virtual_ancestors a2
>> WHERE
>>     a.cont_key IS NOT NULL
>>     AND a.anc_key = collection_data.context_key
>>     AND collection_data.collection_context_key = a2.context_key
>>     AND a2.ancestor_key = 1072173;
>
> I had to drop the MIN( a1.context_key ) and the LIMIT 1 from the first
> select statement of your version in order to avoid syntax issues or other
> errors. The version above does produce the same counts as the original,
> but in the end it wasn't really a win for us. Here's the plan it produced:
>
>> HashAggregate  (cost=707724.36..707726.36 rows=200 width=4) (actual time=27638.844..27639.706 rows=3522 loops=1)
>>   Buffers: shared hit=79323, temp read=49378 written=47557
>>   CTE qq
>>     ->  Recursive Union  (cost=0.00..398869.78 rows=10814203 width=8) (actual time=0.018..20196.397 rows=10821685 loops=1)
>>           Buffers: shared hit=74449, temp read=49378 written=23779
>>           ->  Seq Scan on virtual_ancestors a1  (cost=0.00..182584.93 rows=10814193 width=8) (actual time=0.010..2585.411 rows=10821685 loops=1)
>>                 Buffers: shared hit=74443
>>           ->  Limit  (cost=0.00..0.08 rows=1 width=8) (actual time=7973.297..7973.298 rows=1 loops=1)
>>                 Buffers: shared hit=6, temp read=49378 written=1
>>                 ->  Nested Loop  (cost=0.00..30881281719119.79 rows=389822567470830 width=8) (actual time=7973.296..7973.296 rows=1 loops=1)
>>                       Join Filter: (a1.context_key > qq.cont_key)
>>                       Rows Removed by Join Filter: 22470607
>>                       Buffers: shared hit=6, temp read=49378 written=1
>>                       ->  Index Scan using virtual_context_key_idx on virtual_ancestors a1  (cost=0.00..18206859.46 rows=10814193 width=8) (actual time=0.018..0.036 rows=3 loops=1)
>>                             Buffers: shared hit=6
>>                       ->  WorkTable Scan on qq  (cost=0.00..2162838.60 rows=108141930 width=4) (actual time=0.008..1375.445 rows=7490203 loops=3)
>>                             Buffers: temp read=49378 written=1
>>   ->  Hash Join  (cost=25283.37..308847.31 rows=2905 width=4) (actual time=449.167..27629.759 rows=13152 loops=1)
>>         Hash Cond: (a.anc_key = collection_data.context_key)
>>         Buffers: shared hit=79323, temp read=49378 written=47557
>>         ->  CTE Scan on qq a  (cost=0.00..216284.06 rows=10760132 width=8) (actual time=0.021..25265.179 rows=10821685 loops=1)
>>               Filter: (cont_key IS NOT NULL)
>>               Buffers: shared hit=74449, temp read=49378 written=47557
>>         ->  Hash  (cost=25282.14..25282.14 rows=98 width=4) (actual time=373.836..373.836 rows=2109 loops=1)
>>               Buckets: 1024  Batches: 1  Memory Usage: 75kB
>>               Buffers: shared hit=4874
>>               ->  Hash Join  (cost=17557.15..25282.14 rows=98 width=4) (actual time=368.374..373.013 rows=2109 loops=1)
>>                     Hash Cond: (a2.context_key = collection_data.collection_context_key)
>>                     Buffers: shared hit=4874
>>                     ->  Index Only Scan using virtual_ancestors_pkey on virtual_ancestors a2  (cost=0.00..238.57 rows=272 width=4) (actual time=0.029..1.989 rows=1976 loops=1)
>>                           Index Cond: (ancestor_key = 1072173)
>>                           Heap Fetches: 917
>>                           Buffers: shared hit=883
>>                     ->  Hash  (cost=10020.40..10020.40 rows=602940 width=8) (actual time=368.057..368.057 rows=603066 loops=1)
>>                           Buckets: 65536  Batches: 1  Memory Usage: 23558kB
>>                           Buffers: shared hit=3991
>>                           ->  Seq Scan on collection_data  (cost=0.00..10020.40 rows=602940 width=8) (actual time=0.006..146.447 rows=603066 loops=1)
>>                                 Buffers: shared hit=3991
>> Total runtime: 27854.200 ms
>
> I also tried including the MIN( a1.context_key ) in the first select
> statement as you had written it, but upon doing that it became necessary
> to add a GROUP BY clause, and doing that changed the final number of rows
> selected:
>
>> ERROR:  column "a1.ancestor_key" must appear in the GROUP BY clause or be
>> used in an aggregate function
>> LINE 4:     min( a1.context_key ), ancestor_key
>>                                    ^
>
> Including the LIMIT 1 at the end of the first select statement gave a
> syntax error that I couldn't seem to get past, so I think it might be
> invalid:
>
>> ERROR:  syntax error at or near "UNION"
>> LINE 8:     UNION (
>>             ^
>
> So I landed on the version that I posted above, which seems to select the
> same set in all of the cases that I tried.
>
> Anyway, thanks again for taking a stab at helping, I do appreciate it. If
> you have any other ideas that might be of help I'd certainly be happy to
> hear them.
>
> Take care,
> /Stefan
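For what it's worth, the seed that the GROUP BY error message asks for would presumably look like the sketch below. It parses, but as noted above it collapses the seed to one row per ancestor_key, which is why the final row counts changed:

    -- hypothetical seed for the recursive CTE; note that grouping
    -- changes the result set relative to the original query
    SELECT min( a1.context_key ) AS cont_key, a1.ancestor_key AS anc_key
    FROM virtual_ancestors a1
    GROUP BY a1.ancestor_key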
> On Thu, Mar 20, 2014 at 11:02 PM, Ilya Kosmodemiansky
> <ilya.kosmodemiansky@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
>>
>> Hi Stefan!
>>
>> Probably you need to rewrite your query like this (check it first):
>>
>> with RECURSIVE qq(cont_key, anc_key) as
>> (
>>     select min(a1.context_key), ancestor_key from virtual_ancestors a1
>>     union select
>>         (SELECT a1.context_key, ancestor_key
>>          FROM virtual_ancestors a1
>>          where context_key > cont_key
>>          order by context_key limit 1)
>>     from qq where cont_key is not null
>> )
>> select a1.cont_key
>> from qq a1, collection_data, virtual_ancestors a2
>> WHERE
>>     a1.anc_key = collection_data.context_key
>>     AND collection_data.collection_context_key = a2.context_key
>>     AND a2.ancestor_key = ?
>>
>> best regards,
>> Ilya
>>
>> On Fri, Mar 21, 2014 at 12:56 AM, Stefan Amshey <sramshey@xxxxxxxxx>
>> wrote:
>> > We have a slow-performing query that we are trying to improve, and it
>> > appears to be performing a sequential scan at a point where it should
>> > be utilizing an index. Can anyone tell me why postgres is opting to do
>> > it this way?
>> >
>> > The original query is as follows:
>> >
>> > SELECT DISTINCT
>> >     a1.context_key
>> > FROM
>> >     virtual_ancestors a1, collection_data, virtual_ancestors a2
>> > WHERE
>> >     a1.ancestor_key = collection_data.context_key
>> >     AND collection_data.collection_context_key = a2.context_key
>> >     AND a2.ancestor_key = ?
>> >
>> > The key relationships should all be using indexed columns, but the
>> > query plan that postgres comes up with ends up performing a sequential
>> > scan on the collection_data table (in this case about 602k rows) where
>> > we would have expected it to utilize the index:
>> >
>> > HashAggregate  (cost=60905.73..60935.73 rows=3000 width=4) (actual time=3366.165..3367.354 rows=3492 loops=1)
>> >   Buffers: shared hit=16291 read=1222
>> >   ->  Nested Loop  (cost=17546.26..60898.23 rows=3000 width=4) (actual time=438.332..3357.918 rows=13037 loops=1)
>> >         Buffers: shared hit=16291 read=1222
>> >         ->  Hash Join  (cost=17546.26..25100.94 rows=98 width=4) (actual time=408.554..415.767 rows=2092 loops=1)
>> >               Hash Cond: (a2.context_key = collection_data.collection_context_key)
>> >               Buffers: shared hit=4850 read=3
>> >               ->  Index Only Scan using virtual_ancestors_pkey on virtual_ancestors a2  (cost=0.00..233.32 rows=270 width=4) (actual time=8.532..10.703 rows=1960 loops=1)
>> >                     Index Cond: (ancestor_key = 1072173)
>> >                     Heap Fetches: 896
>> >                     Buffers: shared hit=859 read=3
>> >               ->  Hash  (cost=10015.56..10015.56 rows=602456 width=8) (actual time=399.708..399.708 rows=602570 loops=1)
>> >                     Buckets: 65536  Batches: 1  Memory Usage: 23538kB
>> >                     Buffers: shared hit=3991
>> > ######## sequential scan occurs here ##########
>> >                     ->  Seq Scan on collection_data  (cost=0.00..10015.56 rows=602456 width=8) (actual time=0.013..163.509 rows=602570 loops=1)
>> >                           Buffers: shared hit=3991
>> >         ->  Index Only Scan using virtual_ancestors_pkey on virtual_ancestors a1  (cost=0.00..360.70 rows=458 width=8) (actual time=1.339..1.403 rows=6 loops=2092)
>> >               Index Cond: (ancestor_key = collection_data.context_key)
>> >               Heap Fetches: 7067
>> >               Buffers: shared hit=11441 read=1219
>> > Total runtime: 3373.058 ms
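A standard way to tell whether the planner is unable or merely unwilling to use collection_data_context_key_index here is to make sequential scans artificially expensive for a single session and compare plans; if the index plan then appears and is actually faster, the problem lies in cost estimation (statistics or cost parameters) rather than in the index itself. A sketch of that experiment, reusing the literal key from the plans above:

    -- session-local experiment only; never set enable_seqscan off globally
    BEGIN;
    SET LOCAL enable_seqscan = off;
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT DISTINCT a1.context_key
    FROM virtual_ancestors a1, collection_data, virtual_ancestors a2
    WHERE a1.ancestor_key = collection_data.context_key
      AND collection_data.collection_context_key = a2.context_key
      AND a2.ancestor_key = 1072173;
    ROLLBACK;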
>> > The table definitions are as follows:
>> >
>> >          Table "public.virtual_ancestors"
>> >     Column    |   Type   | Modifiers
>> > --------------+----------+-----------
>> >  ancestor_key | integer  | not null
>> >  context_key  | integer  | not null
>> >  degree       | smallint | not null
>> > Indexes:
>> >     "virtual_ancestors_pkey" PRIMARY KEY, btree (ancestor_key, context_key)
>> >     "virtual_context_key_idx" btree (context_key)
>> > Foreign-key constraints:
>> >     "virtual_ancestors_ancestor_key_fkey" FOREIGN KEY (ancestor_key) REFERENCES contexts(context_key)
>> >     "virtual_ancestors_context_key_fkey" FOREIGN KEY (context_key) REFERENCES contexts(context_key)
>> >
>> >               Table "public.collection_data"
>> >          Column         |         Type         | Modifiers
>> > ------------------------+----------------------+-----------
>> >  collection_context_key | integer              | not null
>> >  context_key            | integer              | not null
>> >  type                   | character varying(1) | not null
>> >  source                 | character varying(1) | not null
>> > Indexes:
>> >     "collection_data_context_key_idx" btree (context_key)
>> >     "collection_data_context_key_index" btree (collection_context_key) CLUSTER
>> > Foreign-key constraints:
>> >     "collection_data_collection_context_key_fkey" FOREIGN KEY (collection_context_key) REFERENCES contexts(context_key) ON DELETE CASCADE
>> >     "collection_data_context_key_fkey" FOREIGN KEY (context_key) REFERENCES contexts(context_key) ON DELETE CASCADE
>> >
>> > Can anyone suggest a way that we can get postgres to use the
>> > collection_data_context_key_index properly? I thought that it might be
>> > related to the fact that collection_data_context_key_index is a
>> > CLUSTERED index, but we did some basic experimentation that seems to
>> > indicate otherwise, i.e. the bad plan persists despite re-clustering
>> > the index.
>> >
>> > We are using PostgreSQL 9.2.5 on x86_64-unknown-linux-gnu, compiled by
>> > gcc (Debian 4.4.5-8) 4.4.5, 64-bit.
>> >
>> > Interestingly, on an instance running PostgreSQL 9.2.4 on
>> > x86_64-unknown-linux-gnu, compiled by gcc (Debian 4.4.5-8) 4.4.5,
>> > 64-bit, where I copied the 2 tables over to a temporary database, the
>> > plan comes out differently:
>> >
>> > HashAggregate  (cost=39692.03..39739.98 rows=4795 width=4) (actual time=73.285..75.141 rows=3486 loops=1)
>> >   Buffers: shared hit=22458
>> >   ->  Nested Loop  (cost=0.00..39680.05 rows=4795 width=4) (actual time=0.077..63.116 rows=13007 loops=1)
>> >         Buffers: shared hit=22458
>> >         ->  Nested Loop  (cost=0.00..32823.38 rows=164 width=4) (actual time=0.056..17.685 rows=2084 loops=1)
>> >               Buffers: shared hit=7529
>> >               ->  Index Only Scan using virtual_ancestors_pkey on virtual_ancestors a2  (cost=0.00..1220.85 rows=396 width=4) (actual time=0.025..2.732 rows=1954 loops=1)
>> >                     Index Cond: (ancestor_key = 1072173)
>> >                     Heap Fetches: 1954
>> >                     Buffers: shared hit=1397
>> > ######## Note the index scan here - this is what it SHOULD be doing ##############
>> >               ->  Index Scan using collection_data_context_key_index on collection_data  (cost=0.00..79.24 rows=56 width=8) (actual time=0.004..0.005 rows=1 loops=1954)
>> >                     Index Cond: (collection_context_key = a2.context_key)
>> >                     Buffers: shared hit=6132
>> >         ->  Index Only Scan using virtual_ancestors_pkey on virtual_ancestors a1  (cost=0.00..35.40 rows=641 width=8) (actual time=0.007..0.015 rows=6 loops=2084)
>> >               Index Cond: (ancestor_key = collection_data.context_key)
>> >               Heap Fetches: 13007
>> >               Buffers: shared hit=14929
>> > Total runtime: 76.431 ms
>> >
>> > Why can't I get the Postgres 9.2.5 instance to use the optimal plan?
>> >
>> > Thanks in advance!
>> > /Stefan
>> >
>> > --
>> > -
>> > Stefan Amshey
>>
>>
>> --
>> Ilya Kosmodemiansky,
>>
>> PostgreSQL-Consulting.com
>> tel. +14084142500
>> cell. +4915144336040
>> ik@xxxxxxxxxxxxxxxxxxxxxxxxx
>
> --
> -
> Stefan Amshey

--
Ilya Kosmodemiansky,

PostgreSQL-Consulting.com
tel. +14084142500
cell. +4915144336040
ik@xxxxxxxxxxxxxxxxxxxxxxxxx
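A closing observation on the 9.2.4 vs. 9.2.5 puzzle: the two plans start from different row estimates (rows=270 vs. rows=396 on the a2 index scan; rows=98 vs. rows=164 on the join against collection_data), and both underestimate the roughly 2000 rows actually returned for ancestor_key = 1072173. Estimate differences like that usually trace back to per-table statistics rather than to the minor server version. A cheap, hedged experiment (the target value 1000 is an arbitrary illustration, not something recommended in the thread) would be to widen the statistics on the misestimated columns and re-analyze:

    -- widen the per-column sample used for histograms/MCVs, then refresh
    ALTER TABLE virtual_ancestors
        ALTER COLUMN ancestor_key SET STATISTICS 1000;
    ALTER TABLE collection_data
        ALTER COLUMN collection_context_key SET STATISTICS 1000;
    ANALYZE virtual_ancestors;
    ANALYZE collection_data;

If the estimates move closer to the actual row counts, the 9.2.5 planner may well start pricing the nested-loop/index plan below the hash join on its own.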