I found in a production system that mostly the performance is being crippled by the handling of one particular case. I hope I've reduced this to a minimal example, below. Summary: when more than one key is given (or the data is sourced from a table) the planner is forced to a join of the whole data set. The query aggregates a small amount of the data in a large data set. It looks like lookups via a Nested Loop would be considerably quicker. I did this explicitly in the UNION query. What is that prevents the index condition from being used in earlier parts of the query? Only where a single condition is present is it be used below the final join. Is it that a Nested Loop cannot make several independent lookups via an index? Is my input preventing the planner doing this, or would it need to be smarter about something? It seems interesting that it is able to do this successfully in the third plan: "As above but without the join to the job table". Thanks -- Mark -- -- Create schema: job -E task -E resource -- CREATE TABLE job ( jid integer PRIMARY KEY ); CREATE TABLE task ( jid integer REFERENCES job (jid), tid integer PRIMARY KEY ); CREATE TABLE resource ( tid integer REFERENCES task (tid), name text ); CREATE INDEX idx_task ON task (jid, tid); CREATE INDEX idx_resource_tid ON resource (tid, name); CREATE INDEX idx_resource_name ON resource (name, tid); -- -- Populate with data: -- 10000 jobs, -- 1000 tasks per job, -- 0-4 resources per task -- CREATE OR REPLACE FUNCTION populate() RETURNS VOID AS $$ DECLARE t integer; BEGIN FOR t IN 0..10000 LOOP INSERT INTO job VALUES (t); END LOOP; FOR t IN 0..10000000 LOOP INSERT INTO task VALUES (random() * 10000, t); IF random() > 0.1 THEN INSERT INTO resource VALUES (t, 'wallace'); INSERT INTO resource VALUES (t, 'gromit'); END IF; IF random() > 0.9 THEN INSERT INTO resource VALUES (t, 'shaun'); END IF; IF random() > 0.6 THEN INSERT INTO resource VALUES (t, 'wendolene'); END IF; END LOOP; END $$ LANGUAGE plpgsql; SELECT populate(); VACUUM ANALYZE; -- Define some simple aggregation with a left join CREATE VIEW middle AS SELECT task.jid, task.tid, COUNT(resource.name) AS nresource FROM task LEFT JOIN resource ON task.tid = resource.tid GROUP BY task.jid, task.tid; -- Aggregate again for a single key: fast -- "Nested Loop" is used SELECT job.jid, sum(nresource) FROM job INNER JOIN middle ON job.jid = middle.jid WHERE job.jid IN (1234) GROUP BY job.jid; -- GroupAggregate (cost=0.00..35026.04 rows=1 width=12) -- -> Nested Loop (cost=0.00..35021.13 rows=980 width=12) -- -> Index Only Scan using job_pkey on job (cost=0.00..4.28 rows=1 width=4) -- Index Cond: (jid = 1234) -- -> GroupAggregate (cost=0.00..34997.25 rows=980 width=15) -- -> Nested Loop Left Join (cost=0.00..34970.55 rows=2254 width=15) -- -> Index Only Scan using idx_task on task (cost=0.00..55.98 rows=980 width=8) -- Index Cond: (jid = 1234) -- -> Index Only Scan using idx_resource_tid on resource (cost=0.00..35.54 rows=7 width=11) -- Index Cond: (tid = task.tid) -- (10 rows) -- As above, but with two keys: slow -- "Merge Join" is attempted; this is the 'bad' case EXPLAIN SELECT job.jid, sum(nresource) FROM job INNER JOIN middle ON job.jid = middle.jid WHERE job.jid IN (1234, 5678) GROUP BY job.jid; -- GroupAggregate (cost=5636130.95..6091189.12 rows=2 width=12) -- -> Merge Join (cost=5636130.95..6091179.10 rows=2000 width=12) -- Merge Cond: (task.jid = job.jid) -- -> GroupAggregate (cost=5636130.95..5966140.73 rows=9999986 width=15) -- -> Sort (cost=5636130.95..5693633.43 rows=23000992 width=15) -- Sort Key: task.jid, task.tid -- -> Merge Left Join (cost=0.00..1251322.49 rows=23000992 width=15) -- Merge Cond: (task.tid = resource.tid) -- -> Index Scan using task_pkey on task (cost=0.00..281847.80 rows=9999986 width=8) -- -> Index Only Scan using idx_resource_tid on resource (cost=0.00..656962.32 rows=23000992 width=11) -- -> Materialize (cost=0.00..8.55 rows=2 width=4) -- -> Index Only Scan using job_pkey on job (cost=0.00..8.54 rows=2 width=4) -- Index Cond: (jid = ANY ('{1234,5678}'::integer[])) -- (13 rows) -- As above but without the join to the job table: fast SELECT jid, sum(nresource) FROM middle WHERE jid IN (1234, 5678) GROUP BY jid; -- GroupAggregate (cost=0.00..69995.03 rows=200 width=12) -- -> GroupAggregate (cost=0.00..69963.62 rows=1961 width=15) -- -> Nested Loop Left Join (cost=0.00..69910.18 rows=4511 width=15) -- -> Index Only Scan using idx_task on task (cost=0.00..93.39 rows=1961 width=8) -- Index Cond: (jid = ANY ('{1234,5678}'::integer[])) -- -> Index Only Scan using idx_resource_tid on resource (cost=0.00..35.52 rows=7 width=11) -- Index Cond: (tid = task.tid) -- (7 rows) -- Kludge to lookup two keys: fast (cost 70052) SELECT job.jid, sum(nresource) FROM job INNER JOIN middle ON job.jid = middle.jid WHERE job.jid IN (1234) GROUP BY job.jid UNION SELECT job.jid, sum(nresource) FROM job INNER JOIN middle ON job.jid = middle.jid WHERE job.jid IN (5678) GROUP BY job.jid; -- -- Repeat with job keys from a table instead of 'IN' clause. -- CREATE TABLE one_job ( jid integer PRIMARY KEY ); CREATE TABLE two_jobs ( jid integer PRIMARY KEY ); INSERT INTO one_job VALUES (1234); INSERT INTO two_jobs VALUES (1234), (5678); ANALYZE one_job; ANALYZE two_jobs; -- Joining against one row: slow (cost 5636131.97..6092141.59) -- "Merge Join" is attempted EXPLAIN SELECT job.jid, sum(nresource) FROM one_job job INNER JOIN middle ON job.jid = middle.jid GROUP BY job.jid; -- Joining against two rows: slow (cost 5636131.98..6093141.60) -- "Merge Join" is attempted EXPLAIN SELECT job.jid, sum(nresource) FROM two_jobs job INNER JOIN middle ON job.jid = middle.jid GROUP BY job.jid; -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance