On 16/05/13 03:52, Heikki Linnakangas wrote:
On 15.05.2013 18:31, Shaun Thomas wrote:
I've seen conversations on this since at least 2005. There were even
proposed patches every once in a while, but never any consensus.
Anyone care to comment?
Well, as you said, there has never been any consensus.
There are basically two pieces to the puzzle:
1. What metric do you use to represent correlation between columns?
2. How do you collect that statistic?
Based on the prior discussions, collecting the stats seems to be
tricky. It's not clear for which combinations of columns it should
be collected (all possible combinations? That explodes
quickly...), or how it can be collected without scanning the whole
table.
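
To put a rough number on "explodes quickly", here is a minimal sketch that
counts how many column groups would need a statistic. The function name
and the optional cap on group size are illustrative assumptions, not
anything proposed in this thread.

```python
from math import comb
from typing import Optional


def column_combinations(n_columns: int, max_group_size: Optional[int] = None) -> int:
    """Count the column groups of size 2 or more in a table.

    max_group_size is a hypothetical cap (e.g. 2 = pairs only); when it is
    None, every group size up to n_columns is counted.
    """
    if max_group_size is None:
        max_group_size = n_columns
    # Sum C(n, k) for k = 2 .. max_group_size.
    return sum(comb(n_columns, k) for k in range(2, max_group_size + 1))


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, "columns:", column_combinations(n, 2), "pairs,",
              column_combinations(n), "groups of any size")
```

Even restricted to pairs the count grows quadratically; without a cap it
grows as 2^n, which is why collecting everything is not an option.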
I think it would be pretty straightforward to use such a
statistic, once we have it. So perhaps we should get started by
allowing the DBA to set a correlation metric manually, and use
that in the planner.
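
As a sketch of how a manually set metric might be used, assume the DBA
supplies a single coefficient c between 0 and 1 for a pair of columns,
where 0 means independent and 1 means fully dependent. The linear
interpolation below is an assumption made for illustration; it is not an
existing PostgreSQL setting or planner formula.

```python
def combined_selectivity(sel_a: float, sel_b: float, dependence: float = 0.0) -> float:
    """Estimate the selectivity of (pred_a AND pred_b).

    dependence is a hypothetical, DBA-supplied value in [0, 1]:
      0.0 -> columns treated as independent (sel_a * sel_b)
      1.0 -> columns treated as fully dependent (min(sel_a, sel_b))
    Values in between interpolate linearly between the two extremes.
    """
    if not 0.0 <= dependence <= 1.0:
        raise ValueError("dependence must be between 0 and 1")
    independent = sel_a * sel_b
    fully_dependent = min(sel_a, sel_b)
    return (1.0 - dependence) * independent + dependence * fully_dependent


if __name__ == "__main__":
    table_rows = 1_000_000
    for c in (0.0, 0.5, 1.0):
        est = combined_selectivity(0.1, 0.1, c) * table_rows
        print(f"dependence={c}: estimated rows = {est:.0f}")
```

With two predicates that each match 10% of the rows, the estimate moves
from 1% of the table (independent) to 10% (fully dependent) as the
coefficient goes from 0 to 1.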
- Heikki
How about pg comparing the actual number of rows delivered with the
predicted number and, if the difference exceeds a specified threshold,
then maintaining statistics?
There is obviously more to it, such as whether this is a relevant query
to consider and the size of the tables (no point in attempting to
optimise tables with only 10 rows, for example).
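
A rough sketch of that check, purely for illustration: the function name,
the ratio-based threshold and the default values are all assumptions, and
nothing like this exists in pg as far as this thread goes.

```python
def should_collect_stats(estimated_rows: float,
                         actual_rows: float,
                         table_rows: int,
                         ratio_threshold: float = 10.0,
                         min_table_rows: int = 1000) -> bool:
    """Decide whether a misestimate is large enough to act on.

    All thresholds are hypothetical defaults chosen for the example.
    """
    # Tiny tables are not worth the effort, however wrong the estimate.
    if table_rows < min_table_rows:
        return False
    larger = max(estimated_rows, actual_rows)
    # Clamp to 1 so an estimate (or result) of 0 rows does not divide by zero.
    smaller = max(min(estimated_rows, actual_rows), 1.0)
    # Treat over- and under-estimates symmetrically.
    return larger / smaller >= ratio_threshold


if __name__ == "__main__":
    print(should_collect_stats(100, 5000, 1_000_000))  # 50x off on a big table
    print(should_collect_stats(100, 150, 1_000_000))   # 1.5x off
    print(should_collect_stats(1, 10, 10))             # 10-row table
```

The "is this a relevant query" part is left out here; it would need some
extra input, such as how often the query runs.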
Cheers,
Gavin