Tom Lane wrote:
What the stats entry is saying is that the most common entries occur
about 75000 times apiece (0.00166667 * 45e6), which is what's scaring
the planner here ;-). I think those frequencies are artificially high
though. The default statistics sample size is 3000 rows (300 *
statistics target, actually), so those numbers correspond to 5 or 4
rows in the sample, which is probably just random chance.
Try increasing the stats targets for this table to 100, then re-ANALYZE
and see what you get. The most_common_freqs entries might drop as much
as a factor of 10.
regards, tom lane
Tom:
I believe this was the problem. I upped the statistics to 100, for a
sample size of 30k and now the planner does the correct nested
loop/index scan and takes only 30 seconds! This is a HUGE performance
increase.
I wonder why the estimates were so far off the first time? This table
has been ANALYZED regularly ever since creation.
Once again, thank you and all of the developers for your hard work on
PostgreSQL. This is by far the most pleasant management experience of
any database I've worked on.
-Dan