Re: incorrect row estimates for primary key join

Tom Lane <tgl@xxxxxxxxxxxxx> · Tue, 25 Jun 2013 19:36:17 -0400

Kevin Grittner <kgrittn@xxxxxxxxx> writes:
> Ben <midfield@xxxxxxxxx> wrote:
>> it seems to me that an equality join between two relations (call them A and B)
>> using columns in relation B with a unique constraint should yield row estimates
>> which are at most equal to the row estimates for relation A.  my questions are
>> 
>> 1 - is this correct?
>> 
>> 2 - does the postgresql planner implement this when generating row estimates?

> That seems intuitive, but some of the estimates need to be made
> before all such information is available.  Maybe we can do
> something about that some day....
> Maybe someone else will jump in here with more details than I can
> provide (at least without hours digging in the source code).

It does not attempt to match up query WHERE clauses with indexes during
selectivity estimation, so the existence of a multi-column unique
constraint wouldn't help it improve the estimate.

In the case at hand, I doubt that a better result rowcount estimate
would have changed the planner's opinion of how to do the join.  The OP
seems to be imagining that 2 million index probes into a large table
would be cheap, but that's hardly true.  It's quite likely that the
mergejoin actually is the best way to do the query.  If it isn't really
best on his hardware, I would think that indicates a need for some
tuning of the cost parameters.  Another thing that might be helpful for
working with such large tables is increasing work_mem, to make hashes
and sorts run faster.

			regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance