How about trying: Select * From (Select * from t28 where t28.0='spec') t28a Left out join (t1 JOIN t11 ON > (t11.o = '<http://example.org>' AND t11.s = t1.o)) ON t28a.s = t1.s In this way, I think, the where clause on t28 would be performed before the join rather than after. Jonathan Blitz > -----Original Message----- > From: Aaron Birkland [mailto:birkie@xxxxxxxxx] > Sent: Sunday, December 03, 2006 5:12 PM > To: pgsql-performance@xxxxxxxxxxxxxx > Subject: [PERFORM] Propagating outer join conditions > > The following left outer join plan puzzles me: > > EXPLAIN ANALYZE SELECT * from t28 LEFT OUTER JOIN (t1 JOIN t11 ON > (t11.o = '<http://example.org>' AND t11.s = t1.o)) ON t28.s = t1.s > WHERE t28.o = '"spec"'; > > t28, t1, and t11 all have indexed columns named 's' and 'o' that contain 'text'; > > Nested Loop Left Join (cost=794249.26..3289704.61 rows=1 width=301) > (actual time=581293.390..581293.492 rows=1 loops=1) > Join Filter: (t28.s = t1.s) > -> Index Scan using t28_o on t28 (cost=0.00..9.22 rows=1 > width=89) (actual time=0.073..0.077 rows=1 loops=1) > Index Cond: (o = '"spec"'::text) > -> Merge Join (cost=794249.26..3267020.66 rows=1813979 width=212) > (actual time=230365.522..577078.266 rows=1894969 loops=1) > Merge Cond: (t1.o = t11.s) > -> Index Scan using t1_o on t1 (cost=0.00..2390242.10 > rows=22225696 width=109) (actual time=0.209..162586.801 rows=22223925 > loops=1) > -> Sort (cost=794249.26..798784.21 rows=1813979 width=103) > (actual time=230365.175..237409.474 rows=1894969 loops=1) > Sort Key: t11.s > -> Bitmap Heap Scan on t11 (cost=78450.82..605679.55 > rows=1813979 width=103) (actual time=3252.103..22782.271 rows=1894969 > loops=1) > Recheck Cond: (o = '<http://example.org>'::text) > -> Bitmap Index Scan on t11_o > (cost=0.00..78450.82 rows=1813979 width=0) (actual > time=2445.422..2445.422 rows=1894969 loops=1) > Index Cond: (o = '<http://example.org>'::text) > > > It seems to me that this plan is not very desirable, since the outer > part of the nested loop left join (the merge join node) is very > expensive. Is is possible to generate a plan that looks like this: > > Nested Loop Left Join (cost=???) > -> Index Scan using t28_o on t28 (cost=0.00..9.11 rows=1 width=89) > Index Cond: (o = '"spec"'::text) > -> Nested Loop (cost=???) > -> Index Scan using t1_s on t1 (cost=???) > Index Cond: (s = t28.s) > -> Bitmap Heap Scan on t11 (cost=???) > Recheck Cond: (t11.s = t1.o) > Filter: (o = '<http://example.org>'::text) > -> Bitmap Index Scan on t11_s (cost=??? ) > Index Cond: (t11.s = t1.o) > > I *think* this plan is equivalent to the above if I'm assuming the > behaviour of the 'nested loop left join' node correctly. So far, I > have been tweaking the statistics, cost estimates, and > enabling.disabling certain plans to see if I can get it to propagate > the join condition t1.s = t28.s to the outer node of the left join.. > but so far, I cannot. So, my questions are: > > 1) Is my 'desired' query plan logic correct > 2) Can the executor execute a plan such as my 'desired' plan > 3) If (1) and (2) are 'yes', then how may I get the planner to > generate such a plan, or do I just need to look harder into tweaking > the statistics and cost estimates > > -Aaron > > ---------------------------(end of broadcast)--------------------------- > TIP 9: In versions below 8.0, the planner will ignore your desire to > choose an index scan if your joining column's datatypes do not > match > > > -- > No virus found in this incoming message. > Checked by AVG Free Edition. > Version: 7.1.409 / Virus Database: 268.15.6/565 - Release Date: 12/2/2006 > > > -- > No virus found in this incoming message. > Checked by AVG Free Edition. > Version: 7.1.409 / Virus Database: 268.15.6/565 - Release Date: 12/02/2006 > -- No virus found in this outgoing message. Checked by AVG Free Edition. Version: 7.1.409 / Virus Database: 268.15.6/565 - Release Date: 12/02/2006