> My guess is that the reason we run ANALYZE more frequently than vacuum > (with the default settings) is that ANALYZE is pretty cheap. In many > cases, if the statistical distribution of the data hasn't changed > much, then it's not really necessary, but it doesn't cost much either. > And for certain types of usage patterns, like time series (where the > maximum value keeps increasing) it's REALLY important to analyze > frequently. > > But having said that, on the systems I've worked with, I've only > rarely seen a problem caused by not analyzing frequently enough. On > the other hand, I've seen MANY problems caused by not vacuuming > enough. Which is the opposite of my experience; currently we have several clients who have issues which required more-frequent analyzes on specific tables. Before 8.4, vacuuming more frequently, especially on large tables, was very costly; vacuum takes a lot of I/O and CPU. Even with 8.4 it's not something you want to increase without thinking about the tradeoffs. Since I'm responsible for the current defaults, I though I'd explain the reasoning behind them. I developed and tested them while at Greenplum, so they are *not* designed for small databases. #autovacuum_vacuum_threshold = 50 #autovacuum_analyze_threshold = 50 These two are set to the minimum threshold to avoid having small tables get vacuum/analyzed continuously, but to make sure that small tables do get vacuumed & analyzed sometimes. #autovacuum_vacuum_scale_factor = 0.2 This is set because in my experience, 20% bloat is about the level at which bloat starts affecting performance; thus, we want to vacuum at that level but not sooner. This does mean that very large tables which never have more than 10% updates/deletes don't get vacuumed at all until freeze_age; this is a *good thing*. VACUUM on large tables is expensive; you don't *want* to vacuum a billion-row table which has only 100,000 updates. #autovacuum_analyze_scale_factor = 0.1 The 10% threshold for analyze is there because (a) analyze is cheap, and (b) 10% changes to a table can result in very bad plans if the changes are highly skewed towards a specific range, such as additions onto the end of a time-based table. The current postgres defaults were tested on DBT2 as well as pgbench, and in my last 2 years of consulting I've seldom found reason to touch them except on *specific* tables. So I still feel that they are good defaults. It would be worth doing a DBT2/DBT5 test run with different autovac settings post-8.4 so see if we should specifically change the vacuum threshold. Pending that, though, I think the current defaults are good enough. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance