Folks, Yet more evidence that we need some way to assess query plans which are high-risk and avoid them (or have Yet Another GUC): Merge Join (cost=29.16..1648.00 rows=382 width=78) (actual time=57215.167..57215.216 rows=1 loops=1) Merge Cond: (rn.node_id = device_nodes.node_id) -> Nested Loop (cost=0.00..11301882.40 rows=6998 width=62) (actual time=57209.291..57215.030 rows=112 loops=1) Join Filter: (node_ep.node_id = rn.node_id) -> Nested Loop (cost=0.00..11003966.85 rows=90276 width=46) (actual time=0.027..52792.422 rows=90195 loops=1) -> Index Scan using ix_ne_ns on node_ep (cost=0.00..1545943.45 rows=32606992 width=26) (actual time=0.010..7787.043 rows=32606903 loops=1) -> Index Scan using ix_nefp_eid on ep_fp (cost=0.00..0.28 rows=1 width=20) (actual time=0.001..0.001 rows=0 loops=32606903) Index Cond: (ep_fp.ep_id = node_ep.ep_id) -> Materialize (cost=0.00..5.30 rows=220 width=16) (actual time=0.000..0.019 rows=220 loops=90195) -> Seq Scan on mytable rn (cost=0.00..4.20 rows=220 width=16) (actual time=0.008..0.043 rows=220 loops=1) -> Sort (cost=28.18..28.21 rows=12 width=16) (actual time=0.164..0.165 rows=10 loops=1) Sort Key: device_nodes.node_id Sort Method: quicksort Memory: 25kB -> Index Scan using ix_dn_did on device_nodes (cost=0.00..27.96 rows=12 width=16) (actual time=0.086..0.134 rows=10 loops=1) Index Cond: (dev_id = 18165) Total runtime: 57215.329 ms AFAICT, what's happening in this query is that PostgreSQL's statistics on the device_nodes and several other tables are slightly out of date (as in 5% of the table). Thus it thinks that nothing will match the list of node_ids in "mytable", and that it can exit the merge join early and ignore the whole huge cost of the join plan. This particular form of out-of-dateness will be fixed in 9.1 (it's due to values being higher than the highest histogram bucket in pg_stat), but not all forms will be. It really seems like we should be able to detect an obvious high-risk situation like this one. Or maybe we're just being too optimistic about discarding subplans? BTW, the optimal plan for this query (post-analyze) is this one: Nested Loop (cost=0.00..213068.26 rows=12 width=78) (actual time=0.374..0.514 rows=1 loops=1) Join Filter: (device_nodes.node_id = rn.node_id) -> Seq Scan on mytable rn (cost=0.00..4.20 rows=220 width=16) (actual time=0.013..0.050 rows=220 loops=1) -> Materialize (cost=0.00..213024.49 rows=12 width=62) (actual time=0.001..0.002 rows=1 loops=220) -> Nested Loop (cost=0.00..213024.43 rows=12 width=62) (actual time=0.077..0.278 rows=1 loops=1) -> Nested Loop (cost=0.00..211740.04 rows=4428 width=42) (actual time=0.070..0.269 rows=1 loops=1) -> Index Scan using ix_dn_did on device_nodes (cost=0.00..51.92 rows=13 width=16) (actual time=0.058..0.115 rows=10 loops=1) Index Cond: (dev_id = 18165) -> Index Scan using ix_ne_ns on node_ep (cost=0.00..16137.45 rows=11700 width=26) (actual time=0.014..0.014 rows=0 loops=10) Index Cond: (node_ep.node_id = device_nodes.node_id) -> Index Scan using ix_nefp_eid on ep_fp (cost=0.00..0.28 rows=1 width=20) (actual time=0.006..0.007 rows=1 loops=1) Index Cond: (ep_fp.ep_id = node_ep.ep_id); -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance