On Sep 28, 2006, at 8:51 PM, Tom Lane wrote:
> [..]
> The information we've seen says that the only statistically reliable
> way to arrive at an accurate n_distinct estimate is to examine most
> of the table :-(. Which seems infeasible for extremely large tables,
> which is exactly where the problem is worst. Marginal increases in
> the sample size seem unlikely to help much ... as indeed your
> experiment shows.
I think a first step might be to introduce a new analyze command,
such as ANALYZE FULL. This would be executed deliberately (IOW, not
by autovacuum), like CLUSTER or VACUUM FULL, when deemed necessary by
the DBA. As the name implies, the command would scan the entire table
and fill in the stats based on that (as ANALYZE used to do, IIRC). It
would also be useful if this command froze the stats so that
autovacuum didn't clobber them with inaccurate ones shortly
thereafter. Perhaps an explicit ANALYZE FULL FREEZE command would be
useful for that case, the behavior being that a normal ANALYZE would
not overwrite the stats for a stats-frozen table, while another
ANALYZE FULL would.
Such a frozen state would also be useful if you wanted to hand-tweak
the stats for a single table and have them stick while still using
autovac. As I understand it now, with autovac on, you cannot do that
unless you hack the pg_autovacuum table (i.e., set anl_base_thresh to
an artificially high value).
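
For what it's worth, the workaround as it stands today looks roughly
like this (the table name is made up, and it assumes a row for the
table already exists in pg_autovacuum):

    -- Push the per-table analyze threshold so high that autovacuum
    -- effectively never re-analyzes the table, so hand-tweaked stats
    -- stay put.
    UPDATE pg_autovacuum
       SET anl_base_thresh = 2000000000
     WHERE vacrelid = 'my_big_table'::regclass;
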
Another option (which I think others have suggested) would be to make
this the behavior of VACUUM ANALYZE. That at least saves the baggage
of a new command, and it has the advantage that the autovac daemon
could run it. Perhaps some smarts could also be built in: what if
VACUUM ANALYZE first ran a normal (sampled) ANALYZE and then
performed the VACUUM with a full ANALYZE pass? The stats gathered by
the full pass would be compared to those of the sampled pass, and if
they are sufficiently different, the table would be flagged so that
the autovac daemon no longer runs a normal ANALYZE on it. A global
ANALYZE could also ignore it (though this seems more magical).
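
To make "sufficiently different" concrete, the comparison could look
something like the query below, which you could already run by hand
today. The table and column names are made up, the 2x factor is
arbitrary, and negative n_distinct values (stored as a fraction of
the row count) are ignored for simplicity:

    -- Compare the sampled n_distinct estimate for one column against
    -- the true distinct count from a full scan; flag the table if
    -- they disagree by more than a factor of two.
    SELECT s.n_distinct AS sampled_estimate,
           t.true_ndistinct,
           (t.true_ndistinct > 2 * s.n_distinct
            OR s.n_distinct > 2 * t.true_ndistinct) AS should_flag_table
      FROM pg_stats s,
           (SELECT count(DISTINCT some_col) AS true_ndistinct
              FROM my_big_table) t
     WHERE s.tablename = 'my_big_table'
       AND s.attname = 'some_col';
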
A more pie-in-the-sky idea could take advantage of the fact that the
larger a table is, the less likely its statistics are to change much
over time. If we cannot afford to sample many rows in a given analyze
pass, then perhaps we should use a "Newton's method" approach where
we attempt to converge on an accurate value over time, with each
analyze pass contributing more samples to the statistics and honing
them incrementally rather than simply replacing the old ones.
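
As a toy illustration of "contributing" rather than "replacing" (this
is deliberately naive; n_distinct in particular cannot really be
combined with a plain weighted average, so treat it as a stand-in for
whatever estimator a real statistician would use):

    -- Blend the estimates from successive analyze passes, weighting
    -- each pass by the number of rows it sampled, instead of letting
    -- the latest pass clobber the previous value. All numbers are
    -- made up.
    SELECT sum(est::numeric * sampled_rows) / sum(sampled_rows)
               AS blended_estimate
      FROM (VALUES (120000, 30000),   -- pass 1: estimate, rows sampled
                   (155000, 30000),   -- pass 2
                   (140000, 30000))   -- pass 3
           AS passes(est, sampled_rows);
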
I'm not a statistician, so it's not clear to me how much more state
you would need to keep between analyze passes to make this viable,
but in order for this to work the following would need to be true:
1) Analyze would need to be run on a regular basis (luckily we have
autovacuum to help). You would want to analyze the table periodically
even if nothing much changed, however. Perhaps tuning the autovac
parameters is enough here.
2) Each analyze pass would need to sample randomly so that multiple
passes tend to sample different rows.
3) The stats would need to somehow be cumulative. Perhaps this means
storing sample values between passes, or some other statistical voodoo.
4) It needs to be smart enough to realize when a table has changed
drastically, and to toss out the old stats in that case. Either that,
or we require a human to tell us via ANALYZE FULL/VACUUM ANALYZE.
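
A rough sketch of that "changed drastically" test in point 4, using
counters that already exist (the 50% threshold is arbitrary, the
table name is made up, and these counters are cumulative since the
last stats reset rather than since the last analyze, so this is only
illustrative):

    -- If the writes recorded for the table exceed half its estimated
    -- row count, discard the accumulated stats and start converging
    -- from scratch.
    SELECT (s.n_tup_ins + s.n_tup_upd + s.n_tup_del) > 0.5 * c.reltuples
               AS should_discard_old_stats
      FROM pg_stat_user_tables s
      JOIN pg_class c ON c.oid = s.relid
     WHERE s.relname = 'my_big_table';
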
I think the incremental stats approach would more or less depend on
the full ANALYZE functionality for bootstrapping: when you first load
a table, you want to get the stats right immediately and not wait
some indeterminate amount of time for them to "converge" on the right
value.
-Casey