Two "equivalent" WITH RECURSIVE queries, one of them slow.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello.

I have a tree-like table with a three-field PK (name, date, id) and one parent field. It has 5k to 6k records as of now, but it will hold about 1 million records.

I am trying the following WITH RECURSIVE query:

WITH RECURSIVE t AS (
SELECT par.id AS tid, par.name, par.date, par.id, par.text, par.h_title, par.h_name, par.parent
                   FROM _books.par
        UNION
SELECT t.tid AS pid, p.name, p.date, p.id, p.text, p.h_title, p.h_name, p.parent
                   FROM t, _books.par p
WHERE p.name = t.name AND p.date = t.date AND t.id = p.parent
        )
 SELECT t.tid, t.name, t.date, t.id, t.text, t.h_title, t.h_name, t.parent
   FROM t WHERE name = 'cfx' AND date = '2009-08-19' AND tid = '28340';

... which takes 2547.503 ms

However, if I try the same query but adding the same WHERE clause to the
non-recursive term, I get much better results.


WITH RECURSIVE t AS (
SELECT par.id AS tid, par.name, par.date, par.id, par.text, par.h_title, par.h_name, par.parent FROM _books.par WHERE name = 'cfx' AND date = '2009-08-19' AND par.id = '28340'
        UNION
SELECT t.tid AS pid, p.name, p.date, p.id, p.text, p.h_title, p.h_name, p.parent
                   FROM t, _books.par p
WHERE p.name = t.name AND p.date = t.date AND t.id = p.parent
        )
 SELECT t.tid, t.name, t.date, t.id, t.text, t.h_title, t.h_name, t.parent
   FROM t WHERE name = 'cfx' AND date = '2009-08-19' AND tid = '28340';

... which takes 0.221 ms

I am being forced to use the slow query because I want to define it as a
view, leaving the WHERE clause to the application.

I fail to see where the two queries might be different, or, what cases the
slow one considers that the fast one doesn't, as to get a clue on how to
workaround this.

I have taken the EXPLAIN ANALYZE output for both queries. It looks like
the slow one is processing all records (read: not adding the WHERE clause
to the non-recursive term).


QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
CTE Scan on t (cost=96653.20..96820.57 rows=1 width=144) (actual time=32.931..2541.792 rows=1 loops=1) Filter: (((name)::text = 'cfx'::text) AND (date = '2009-08-19'::date) AND (tid = 28340))
   CTE t
-> Recursive Union (cost=0.00..96653.20 rows=6086 width=212) (actual time=0.017..2442.655 rows=33191 loops=1) -> Seq Scan on par (cost=0.00..237.96 rows=5996 width=208) (actual time=0.011..5.591 rows=5996 loops=1) -> Merge Join (cost=8909.74..9629.35 rows=9 width=212) (actual time=225.979..254.727 rows=3022 loops=9) Merge Cond: (((t.name)::text = (p.name)::text) AND (t.date = p.date) AND (t.id = p.parent)) -> Sort (cost=7700.54..7850.44 rows=59960 width=44) (actual time=58.163..59.596 rows=3685 loops=9)
                       Sort name: t.name, t.date, t.id
                       Sort Method:  quicksort  Memory: 17kB
-> WorkTable Scan on t (cost=0.00..1199.20 rows=59960 width=44) (actual time=0.027..3.486 rows=3688 loops=9) -> Materialize (cost=1209.20..1284.15 rows=5996 width=208) (actual time=163.062..177.415 rows=5810 loops=9) -> Sort (cost=1209.20..1224.19 rows=5996 width=208) (actual time=163.054..172.543 rows=5810 loops=9)
                             Sort name: p.name, p.date, p.parent
                             Sort Method:  external merge  Disk: 1304kB
-> Seq Scan on par p (cost=0.00..237.96 rows=5996 width=208) (actual time=0.015..3.330 rows=5996 loops=9)
 Total runtime: 2547.503 ms
(17 rows)



QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
CTE Scan on t (cost=927.80..928.10 rows=1 width=144) (actual time=0.036..0.132 rows=1 loops=1) Filter: (((name)::text = 'cfx'::text) AND (date = '2009-08-19'::date) AND (tid = 28340))
   CTE t
-> Recursive Union (cost=0.00..927.80 rows=11 width=212) (actual time=0.030..0.124 rows=1 loops=1) -> Index Scan using par_id on par (cost=0.00..8.27 rows=1 width=208) (actual time=0.024..0.026 rows=1 loops=1)
                 Index Cond: (id = 28340)
Filter: (((name)::text = 'cfx'::text) AND (date = '2009-08-19'::date)) -> Nested Loop (cost=0.00..91.93 rows=1 width=212) (actual time=0.091..0.091 rows=0 loops=1) Join Filter: (((t.name)::text = (p.name)::text) AND (t.date = p.date)) -> WorkTable Scan on t (cost=0.00..0.20 rows=10 width=44) (actual time=0.001..0.001 rows=1 loops=1) -> Index Scan using par_parent on par p (cost=0.00..9.07 rows=6 width=208) (actual time=0.085..0.085 rows=0 loops=1)
                       Index Cond: (p.parent = t.id)
 Total runtime: 0.221 ms
(13 rows)



books=# \d _books.par
           Table "_books.par"
    Column    |       Type        | Modifiers
--------------+-------------------+-----------
 name         | character varying | not null
 date         | date              | not null
 id           | integer           | not null
 text         | character varying |
 h_title      | character varying |
 h_name       | character varying |
 parent       | integer           |
Indexes:
    "par_pkey" PRIMARY KEY, btree (name, date, id)
    "par_name" btree (name)
    "par_name_fpub_parent" btree (name, date, parent)
    "par_id" btree (id)
    "par_parent" btree (parent)



$ psql --version
psql (PostgreSQL) 8.4.4
contains support for command-line editing




--
Octavio.

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux