Search Postgresql Archives

Bad query? Or planner?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

I have recently started using postgres and have been operating under the assumption that the query planner knows best (as long as I don't do anything too stupid). I've been structuring my queries (and data) in a certain way: writing re-usable subqueries, joining them in as necessary, and only filtering the query at the top level (assuming that the query planner will push down the appropriate restrictions as necessary). Of course, also keeping in mind proper indexes to support efficient joins and sorts.

I recently wrote a query that I thought was easy to reason about, and I assumed the query planner would execute it efficiently.

SELECT * FROM xtag_stack_feed
JOIN (
  SELECT DISTINCT ON (do_post_xtag.xtag_ci) * 
  FROM do_post_xtag
  JOIN do_post ON do_post.rc_id=do_post_xtag.post_rc_id
  ORDER BY do_post_xtag.xtag_ci ASC, do_post_xtag.post_rc_id DESC) last_post
ON last_post.xtag_ci=xtag_stack_feed.xtag_ci
ORDER BY decayed_to_base DESC
LIMIT 1;

Unfortunately, the query as written is not being executed efficiently. I tried to rewrite it in a couple different ways without success, and then learned about lateral joins. Rewritten as follows, it executes efficiently.

SELECT * FROM xtag_stack_feed
JOIN LATERAL (
  SELECT DISTINCT ON (do_post_xtag.xtag_ci) *
  FROM do_post_xtag
  JOIN do_post ON do_post.rc_id=do_post_xtag.post_rc_id
  WHERE do_post_xtag.xtag_ci=xtag_stack_feed.xtag_ci
  ORDER BY do_post_xtag.xtag_ci ASC, do_post_xtag.post_rc_id DESC) last_post
ON true
ORDER BY decayed_to_base DESC
LIMIT 1;

From my naive perspective, it seems like the second query is semantically equivalent to the first; it just has the join condition moved into the subquery as a WHERE filter.

Am I doing something wrong? Is there room for query planner improvement in cases like these?

I've attached the EXPLAIN ANALYZEs. Any help would be much appreciated!

Thanks,
-Devin

v3db=# SELECT version();
                                         version
------------------------------------------------------------------------------------------
 PostgreSQL 9.5.4 on x86_64-pc-linux-gnu, compiled by gcc (Debian 4.9.2-10) 4.9.2, 64-bit
(1 row)

v3db=# EXPLAIN ANALYZE SELECT * FROM xtag_stack_feed
v3db-# JOIN (
v3db(#   SELECT DISTINCT ON (do_post_xtag.xtag_ci) *
v3db(#   FROM do_post_xtag
v3db(#   JOIN do_post ON do_post.rc_id=do_post_xtag.post_rc_id
v3db(#   ORDER BY do_post_xtag.xtag_ci ASC, do_post_xtag.post_rc_id DESC) last_post
v3db-# ON last_post.xtag_ci=xtag_stack_feed.xtag_ci
v3db-# ORDER BY decayed_to_base DESC
v3db-# LIMIT 1;
                                                                                   QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=61553.49..64173.98 rows=1 width=337) (actual time=5006.573..5006.574 rows=1 loops=1)
   ->  Nested Loop  (cost=61553.49..58453893.39 rows=22283 width=337) (actual time=5006.571..5006.571 rows=1 loops=1)
         Join Filter: (xtag_stack_feed.xtag_ci = last_post.xtag_ci)
         Rows Removed by Join Filter: 25973
         ->  Index Scan using xtag_stack_feed_decayed_to_base_xtag_ci_idx on xtag_stack_feed  (cost=0.41..5293.44 rows=48443 width=43) (actual time=0.088..0.088 rows=1 loops=1)
         ->  Materialize  (cost=61553.07..63843.13 rows=22283 width=294) (actual time=4041.034..4988.338 rows=25974 loops=1)
               ->  Subquery Scan on last_post  (cost=61553.07..62860.71 rows=22283 width=294) (actual time=4041.025..4967.023 rows=25974 loops=1)
                     ->  Unique  (cost=61553.07..62637.88 rows=22283 width=294) (actual time=4041.022..4960.130 rows=25974 loops=1)
                           ->  Sort  (cost=61553.07..62095.48 rows=216962 width=294) (actual time=4041.019..4837.741 rows=124511 loops=1)
                                 Sort Key: do_post_xtag.xtag_ci, do_post_xtag.post_rc_id COLLATE "C" DESC
                                 Sort Method: external merge  Disk: 67920kB
                                 ->  Hash Join  (cost=1591.71..12656.56 rows=216962 width=294) (actual time=27.422..230.173 rows=216965 loops=1)
                                       Hash Cond: ((do_post_xtag.post_rc_id)::bpchar = (do_post.rc_id)::bpchar)
                                       ->  Seq Scan on do_post_xtag  (cost=0.00..4098.62 rows=216962 width=36) (actual time=0.007..29.305 rows=216965 loops=1)
                                       ->  Hash  (cost=790.87..790.87 rows=16787 width=258) (actual time=27.136..27.136 rows=16788 loops=1)
                                             Buckets: 16384  Batches: 2  Memory Usage: 2269kB
                                             ->  Seq Scan on do_post  (cost=0.00..790.87 rows=16787 width=258) (actual time=0.004..5.972 rows=16788 loops=1)
 Planning time: 3.842 ms
 Execution time: 5021.670 ms
(19 rows)

v3db=# EXPLAIN ANALYZE SELECT * FROM xtag_stack_feed
v3db-# JOIN LATERAL (
v3db(#   SELECT DISTINCT ON (do_post_xtag.xtag_ci) *
v3db(#   FROM do_post_xtag
v3db(#   JOIN do_post ON do_post.rc_id=do_post_xtag.post_rc_id
v3db(#   WHERE do_post_xtag.xtag_ci=xtag_stack_feed.xtag_ci
v3db(#   ORDER BY do_post_xtag.xtag_ci ASC, do_post_xtag.post_rc_id DESC) last_post
v3db-# ON true
v3db-# ORDER BY decayed_to_base DESC
v3db-# LIMIT 1;
                                                                                   QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=126.19..189.18 rows=1 width=337) (actual time=10.199..10.200 rows=1 loops=1)
   ->  Nested Loop  (cost=126.19..6102745.52 rows=96886 width=337) (actual time=10.196..10.196 rows=1 loops=1)
         ->  Index Scan using xtag_stack_feed_decayed_to_base_xtag_ci_idx on xtag_stack_feed  (cost=0.41..5293.44 rows=48443 width=43) (actual time=0.016..0.016 rows=1 loops=1)
         ->  Unique  (cost=125.78..125.83 rows=2 width=294) (actual time=10.172..10.172 rows=1 loops=1)
               ->  Sort  (cost=125.78..125.80 rows=10 width=294) (actual time=10.170..10.170 rows=1 loops=1)
                     Sort Key: do_post_xtag.post_rc_id COLLATE "C" DESC
                     Sort Method: quicksort  Memory: 621kB
                     ->  Nested Loop  (cost=4.79..125.61 rows=10 width=294) (actual time=0.969..7.945 rows=1231 loops=1)
                           ->  Bitmap Heap Scan on do_post_xtag  (cost=4.50..42.46 rows=10 width=36) (actual time=0.958..2.835 rows=1231 loops=1)
                                 Recheck Cond: (xtag_ci = xtag_stack_feed.xtag_ci)
                                 Heap Blocks: exact=893
                                 ->  Bitmap Index Scan on do_post_xtag_xtag_ci_post_rc_id_idx  (cost=0.00..4.50 rows=10 width=0) (actual time=0.833..0.833 rows=1231 loops=1)
                                       Index Cond: (xtag_ci = xtag_stack_feed.xtag_ci)
                           ->  Index Scan using pk_do_post on do_post  (cost=0.29..8.30 rows=1 width=258) (actual time=0.003..0.003 rows=1 loops=1231)
                                 Index Cond: ((rc_id)::bpchar = (do_post_xtag.post_rc_id)::bpchar)
 Planning time: 0.653 ms
 Execution time: 10.324 ms
(17 rows)
-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux