Shaun Thomas <sthomas@xxxxxxxxx> wrote: > I was thinking more: > > JOIN foo_tab USING (foo_id) WITH (COST=50) The problem I have with that syntax is that it would be hard to read when you have some nested set of joins or a (SELECT) in the JOIN instead of simple table name. For me, at least, it would "get lost" less easily if it were right next to the JOIN keyword. The problem with a COST factor is that it's not obvious to me what it would apply to: - each row on the left? - each row on the right? - each row in the result of the JOIN step? - the entire step? How would it scale based on other criteria which affected the number of rows on either side of the join? If I'm understanding the problem correctly, the part the optimizer gets wrong (because we don't yet have statistics to support a better assumption) is assuming that selection criteria on opposite sides of a join affect entirely independent sets of what would be in the result without the criteria. To use an oft-cited example, when one table is selected by zip code and the other by city, that's a bad assumption about the correlation, leading to bad estimates, leading to bad costing, leading to bad plans. The OP wanted to override step 4, a COST setting would try to override step 3, but I think we would want to override step 1 (until we get statistics which let us compute that accurately). -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance