"henk de wit" <henk53602@xxxxxxxxxxx> writes:
> In that case the prediction is 2 rows, which is only 1 row more than in the
> previous case. Yet the plan is much better and performance improved
> dramatically. Is there a reason/explanation for that?

Well, it's just an estimated-cost comparison.  If there's only one row
then a nestloop join looks like the best way, since it requires no extra
overhead.  But as soon as you get to two rows, the other side of the
join would have to be executed twice, and that's more expensive than
doing it once and setting up a hash table.  In the actual event, with
359 rows out of the scan, the nestloop way is just horrid because it
repeats the other side 359 times :-(

It strikes me that it might be interesting to use a minimum rowcount
estimate of two rows, not one, for any case where we can't actually
prove there is at most one row (ie, the query conditions match a unique
index).  That is probably enough to discourage this sort of brittle
behavior ... though no doubt there'd still be cases where it's the
wrong thing.  We do not actually have any code right now to make such
proofs, but there's been some discussion recently about adding such
logic in support of removing useless outer joins.

>> FWIW, CVS HEAD does get rid of the duplicate conditions for the common
>> case of mergejoinable equality operators --- but it's not explicitly
>> looking for duplicate conditions, rather this is falling out of a new
>> method for making transitive equality deductions.

> This sounds very interesting, Tom. Is there some documentation somewhere
> where I can read about this new method?

Check the archives for mention of equivalence classes, notably these
two threads:
http://archives.postgresql.org/pgsql-hackers/2007-01/msg00568.php
http://archives.postgresql.org/pgsql-hackers/2007-01/msg00826.php

			regards, tom lane
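To make the misestimate scenario above concrete, here is a hypothetical sketch (all table and column names are invented, not taken from the thread) of how one might observe the problem with EXPLAIN ANALYZE and nudge the planner toward a better estimate by raising the column's statistics target:

```sql
-- Hypothetical example; "orders"/"order_lines" and their columns are
-- invented for illustration.  If the planner estimates ~1 row out of
-- the scan on "orders", it tends to pick a nested loop, which
-- re-executes the inner side once per outer row.
EXPLAIN ANALYZE
SELECT *
FROM orders o
JOIN order_lines l ON l.order_id = o.order_id
WHERE o.customer_ref = 'ABC123';
-- If the estimate is 1 row but 359 rows actually match, the inner
-- side runs 359 times instead of being hashed once.

-- Collecting finer-grained statistics on the filtered column often
-- improves the row estimate, and with it the plan choice:
ALTER TABLE orders ALTER COLUMN customer_ref SET STATISTICS 1000;
ANALYZE orders;
```

Comparing the "rows=" figures in the estimated and actual halves of the EXPLAIN ANALYZE output is the usual way to spot this kind of misestimate.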