On Sun, Mar 20, 2011 at 11:20 PM, Adam Tistler <atistler@xxxxxxxxx> wrote: > logicops2=# explain analyze select count(*) from nodes where node_id = any( Array(select node_id from nodes limit 100000) ); > QUERY PLAN > ----------------------------------------------------------------------------------------------------------------------------------------- > Aggregate (cost=1718.59..1718.60 rows=1 width=0) (actual time=509.126..509.127 rows=1 loops=1) > InitPlan 1 (returns $0) > -> Limit (cost=0.00..1637.04 rows=100000 width=4) (actual time=0.010..76.604 rows=100000 loops=1) > -> Seq Scan on nodes (cost=0.00..12355.41 rows=754741 width=4) (actual time=0.008..38.105 rows=100000 loops=1) > -> Bitmap Heap Scan on nodes (cost=42.67..81.53 rows=10 width=0) (actual time=447.274..484.283 rows=100000 loops=1) > Recheck Cond: (node_id = ANY ($0)) > -> Bitmap Index Scan on n_node_id_index (cost=0.00..42.67 rows=10 width=0) (actual time=447.074..447.074 rows=100000 loops=1) > Index Cond: (node_id = ANY ($0)) > Total runtime: 509.209 ms > (9 rows) > > Time: 510.009 ms > > > logicops2=# explain analyze select count(*) from nodes where node_id in (select node_id from nodes limit 100000); > QUERY PLAN > ---------------------------------------------------------------------------------------------------------------------------------------- > Aggregate (cost=3017.17..3017.18 rows=1 width=0) (actual time=1052.866..1052.866 rows=1 loops=1) > -> Nested Loop (cost=2887.04..3016.67 rows=200 width=0) (actual time=167.310..1021.540 rows=100000 loops=1) > -> HashAggregate (cost=2887.04..2889.04 rows=200 width=4) (actual time=167.198..251.205 rows=100000 loops=1) > -> Limit (cost=0.00..1637.04 rows=100000 width=4) (actual time=0.008..80.090 rows=100000 loops=1) > -> Seq Scan on nodes (cost=0.00..12355.41 rows=754741 width=4) (actual time=0.007..41.566 rows=100000 loops=1) > -> Index Scan using n_node_id_index on nodes (cost=0.00..0.63 rows=1 width=4) (actual time=0.006..0.007 rows=1 loops=100000) > Index Cond: (public.nodes.node_id = public.nodes.node_id) > Total runtime: 1053.523 ms > (8 rows) > > Time: 1054.864 ms This is a pretty interesting example. I think this is just an optimizer limitation. When trying to build a join tree (in this case, between the copy of nodes inside the subselect and the copy outside the subselect), the planner considers three main join strategies: hash join, nested loop, merge join. A merge or hash join will have to read the outside-the-subselect copy of nodes in its entirety (I think); the only way to avoid that is to compute the subselect first and then use the index probes to pull out just the matching rows. That's what the planner did in both cases, but in the second case it's not smart enough to see that it can gather up all the values from the inner side of the join and shove them into a bitmap index scan all at once, so it just uses a regular index scan to pull 'em out one at a time. I think this would be pretty tricky to support, since the join node would need to understand all the parameter passing that needs to happen between the inner and outer sides of the loop; it's almost like a whole new join type. You might also want to make the opposite transformation, turning the first plan into the second one, if (for example) the subselect is going to return a gigabyte of data. But we're just not that smart. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance