Hi all, I'm encountering very poor query plans when joining against a union, where one branch of the union is itself a join: specifically, Postgres does the entire inside join and then filters the result, rather than pushing the filters down to the joined tables. I've provided a standalone test case below, including some variations that don't show the problem for comparison. The fourth query is the problematic one - I would have expected the Append portion of the plan to be essentially the same as the one in the second query. I'm running Postgres 9.2.3 on Oracle Enterprise Linux 5 (some machines 5.3, some 5.6) x86_64, installed using the RPMs from http://yum.pgrpms.org/. The problem occurs with identical results (other than raw speed) on both a rather underpowered test server with default Postgres settings and a much better live server which I've attempted to tune somewhat sensibly. Thanks for any advice. START TRANSACTION; CREATE TABLE bundle_contents( item_id INTEGER PRIMARY KEY, bundle_type INTEGER NOT NULL ); INSERT INTO bundle_contents(item_id, bundle_type) SELECT i * 10 + j, i FROM GENERATE_SERIES(1, 100) i, GENERATE_SERIES(1, 10) j; CREATE INDEX ON bundle_contents(bundle_type, item_id); ANALYSE bundle_contents; CREATE TABLE bundle( bundle_id INTEGER PRIMARY KEY, bundle_type INTEGER NOT NULL ); INSERT INTO bundle(bundle_id, bundle_type) SELECT i * 1000 + j, i FROM GENERATE_SERIES(1, 100) i, GENERATE_SERIES(1, 1000) j; CREATE INDEX ON bundle(bundle_type, bundle_id); ANALYSE bundle; CREATE VIEW bundled_item AS SELECT bundle_id AS item_id_a, item_id AS item_id_b FROM bundle NATURAL JOIN bundle_contents; CREATE TABLE unbundled_item( item_id_a INTEGER, item_id_b INTEGER, PRIMARY KEY (item_id_a, item_id_b) ); INSERT INTO unbundled_item(item_id_a, item_id_b) SELECT i, 1 FROM GENERATE_SERIES(1001, 100000, 10) i; INSERT INTO unbundled_item(item_id_a, item_id_b) SELECT i, 2 FROM GENERATE_SERIES(1001, 100000, 20) i; INSERT INTO unbundled_item(item_id_a, item_id_b) SELECT i, 3 FROM GENERATE_SERIES(1001, 100000, 23) i; INSERT INTO unbundled_item(item_id_a, item_id_b) SELECT i, 4 FROM GENERATE_SERIES(1001, 100000, 41) i; CREATE INDEX ON unbundled_item(item_id_b); ANALYSE unbundled_item; CREATE VIEW item AS SELECT item_id_a, item_id_b FROM bundled_item UNION ALL SELECT item_id_a, item_id_b FROM unbundled_item; CREATE TABLE item_reference( reference_id INTEGER PRIMARY KEY, item_id_a INTEGER NOT NULL, item_id_b INTEGER NOT NULL ); INSERT INTO item_reference(reference_id, item_id_a, item_id_b) VALUES(1, 1472, 16), (2, 1299, 3); CREATE INDEX ON item_reference(item_id_a, item_id_b); CREATE INDEX ON item_reference(item_id_b); ANALYSE item_reference; -- no union, nice and fast EXPLAIN (ANALYSE, BUFFERS) SELECT * FROM bundled_item WHERE (item_id_a, item_id_b) = (1472, 16); /* http://explain.depesz.com/s/1ye QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=0.00..16.56 rows=1 width=8) (actual time=0.040..0.041 rows=1 loops=1) Join Filter: (bundle.bundle_type = bundle_contents.bundle_type) Buffers: shared hit=8 -> Index Scan using bundle_pkey on bundle (cost=0.00..8.28 rows=1 width=8) (actual time=0.017..0.017 rows=1 loops=1) Index Cond: (bundle_id = 1472) Buffers: shared hit=4 -> Index Scan using bundle_contents_pkey on bundle_contents (cost=0.00..8.27 rows=1 width=8) (actual time=0.012..0.013 rows=1 loops=1) Index Cond: (item_id = 16) Buffers: shared hit=4 Total runtime: 0.085 ms (10 rows) */ -- using the union, but querying it directly rather -- than joining to it, still fast EXPLAIN (ANALYSE, BUFFERS) SELECT * FROM item WHERE (item_id_a, item_id_b) = (1472, 16); /* http://explain.depesz.com/s/rwA QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------- Result (cost=0.00..24.84 rows=2 width=8) (actual time=0.017..0.068 rows=1 loops=1) Buffers: shared hit=6 read=3 I/O Timings: read=0.032 -> Append (cost=0.00..24.84 rows=2 width=8) (actual time=0.015..0.066 rows=1 loops=1) Buffers: shared hit=6 read=3 I/O Timings: read=0.032 -> Nested Loop (cost=0.00..16.56 rows=1 width=8) (actual time=0.015..0.017 rows=1 loops=1) Join Filter: (bundle.bundle_type = bundle_contents.bundle_type) Buffers: shared hit=6 -> Index Scan using bundle_pkey on bundle (cost=0.00..8.28 rows=1 width=8) (actual time=0.006..0.007 rows=1 loops=1) Index Cond: (bundle_id = 1472) Buffers: shared hit=3 -> Index Scan using bundle_contents_pkey on bundle_contents (cost=0.00..8.27 rows=1 width=8) (actual time=0.004..0.005 rows=1 loops=1) Index Cond: (item_id = 16) Buffers: shared hit=3 -> Index Scan using unbundled_item_item_id_b_idx on unbundled_item (cost=0.00..8.27 rows=1 width=8) (actual time=0.049..0.049 rows=0 loops=1) Index Cond: (item_id_b = 16) Filter: (item_id_a = 1472) Buffers: shared read=3 I/O Timings: read=0.032 Total runtime: 0.116 ms (21 rows) */ -- join but no union, still fast EXPLAIN (ANALYSE, BUFFERS) SELECT * FROM bundled_item NATURAL JOIN item_reference WHERE reference_id = 1; /* http://explain.depesz.com/s/oOy QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------ Nested Loop (cost=1.04..4.50 rows=1 width=12) (actual time=0.092..0.096 rows=1 loops=1) Buffers: shared hit=5 read=3 I/O Timings: read=0.038 -> Merge Join (cost=1.04..1.33 rows=1 width=16) (actual time=0.032..0.035 rows=1 loops=1) Merge Cond: (bundle_contents.item_id = item_reference.item_id_b) Buffers: shared hit=4 -> Index Scan using bundle_contents_pkey on bundle_contents (cost=0.00..43.25 rows=1000 width=8) (actual time=0.010..0.013 rows=7 loops=1) Buffers: shared hit=3 -> Sort (cost=1.03..1.04 rows=1 width=12) (actual time=0.012..0.012 rows=1 loops=1) Sort Key: item_reference.item_id_b Sort Method: quicksort Memory: 25kB Buffers: shared hit=1 -> Seq Scan on item_reference (cost=0.00..1.02 rows=1 width=12) (actual time=0.005..0.007 rows=1 loops=1) Filter: (reference_id = 1) Rows Removed by Filter: 1 Buffers: shared hit=1 -> Index Only Scan using bundle_bundle_type_bundle_id_idx on bundle (cost=0.00..3.16 rows=1 width=8) (actual time=0.056..0.057 rows=1 loops=1) Index Cond: ((bundle_type = bundle_contents.bundle_type) AND (bundle_id = item_reference.item_id_a)) Heap Fetches: 1 Buffers: shared hit=1 read=3 I/O Timings: read=0.038 Total runtime: 0.139 ms (22 rows) */ -- join to the union, slow. the conditions are pushed into the -- index scan in the non-join branch of the union, but in the join -- branch they're applied after the entire join is built EXPLAIN (ANALYSE, BUFFERS) SELECT * FROM item NATURAL JOIN item_reference WHERE reference_id = 1; /* http://explain.depesz.com/s/M73 QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=27.50..27979.82 rows=26 width=12) (actual time=3.230..554.473 rows=1 loops=1) Buffers: shared hit=452 -> Seq Scan on item_reference (cost=0.00..1.02 rows=1 width=12) (actual time=0.013..0.016 rows=1 loops=1) Filter: (reference_id = 1) Rows Removed by Filter: 1 Buffers: shared hit=1 -> Append (cost=27.50..27978.78 rows=2 width=8) (actual time=3.214..554.454 rows=1 loops=1) Buffers: shared hit=451 -> Subquery Scan on "*SELECT* 1" (cost=27.50..27970.50 rows=1 width=8) (actual time=3.212..554.422 rows=1 loops=1) Filter: ((item_reference.item_id_a = "*SELECT* 1".item_id_a) AND (item_reference.item_id_b = "*SELECT* 1".item_id_b)) Rows Removed by Filter: 999999 Buffers: shared hit=448 -> Hash Join (cost=27.50..12970.50 rows=1000000 width=8) (actual time=0.617..344.127 rows=1000000 loops=1) Hash Cond: (bundle.bundle_type = bundle_contents.bundle_type) Buffers: shared hit=448 -> Seq Scan on bundle (cost=0.00..1443.00 rows=100000 width=8) (actual time=0.009..22.066 rows=100000 loops=1) Buffers: shared hit=443 -> Hash (cost=15.00..15.00 rows=1000 width=8) (actual time=0.594..0.594 rows=1000 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 40kB Buffers: shared hit=5 -> Seq Scan on bundle_contents (cost=0.00..15.00 rows=1000 width=8) (actual time=0.008..0.207 rows=1000 loops=1) Buffers: shared hit=5 -> Index Only Scan using unbundled_item_pkey on unbundled_item (cost=0.00..8.28 rows=1 width=8) (actual time=0.021..0.021 rows=0 loops=1) Index Cond: ((item_id_a = item_reference.item_id_a) AND (item_id_b = item_reference.item_id_b)) Heap Fetches: 0 Buffers: shared hit=3 Total runtime: 554.533 ms (27 rows) */ ROLLBACK; -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance