Scott Marlowe wrote:
> Also note that the argument that autovacuum chews up too much IO is
> moot now that you can set cost delay to 10 to 20 milliseconds. Unless
> you're running on the hairy edge of maximum IO at all times, autovac
> should be pretty much unnoticed.
And if you're running on the hairy edge like that, you really need
autovacuum whether you think you can afford it or not. Badly maintained
tables are also I/O intensive, and it's easy for someone who thinks "I'm
too busy to allocate VACUUM time" to end up wasting more resources than
it would have taken to just do things right in the first place. I see
way too many people who suffer from false economy when it comes to
autovacuum planning.
Some comments on this whole discussion:
1) You don't end up with dead rows that [auto]vacuum needs to clean up
only when you delete things. They show up when you UPDATE things,
too--the original row isn't removed until after the new one is written.
The overhead of UPDATEs isn't as bad in 8.3 and later, but just because
you don't delete doesn't mean you don't need VACUUM to clean up dead
rows.
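You can watch this happen in the statistics views. A minimal sketch
(the table name here is hypothetical) that shows UPDATEs leaving dead
tuples behind:

```sql
-- Rewrite every row; each updated row leaves its old version
-- behind as a dead tuple until VACUUM reclaims it.
-- ('my_table' is a hypothetical name--use one of your own tables.)
UPDATE my_table SET some_col = some_col;

-- n_dead_tup climbs even though nothing was deleted:
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'my_table';
```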
2) Any time you find yourself considering VACUUM FULL to clean things
up, you're probably making a mistake, because the situations where it's
the only viable recovery tool tend to be broader disasters. The
guidelines at
http://developer.postgresql.org/pgdocs/postgres/routine-vacuuming.html
spell out my feeling here as a tip: "the best way is to use CLUSTER or
one of the table-rewriting variants of ALTER TABLE". If you're fighting
performance issues because of past maintenance neglect that left an
unknown amount of garbage in a table, it's quite possible you'll find
the reinvigorated performance you get from CLUSTER worth the
maintenance cost of the exclusive lock it holds while it runs. See
http://it.toolbox.com/blogs/database-soup/getting-rid-of-vacuum-full-feedback-needed-33959
for more information. I run CLUSTER all the time, and every time I
think I'm saving time by doing VACUUM FULL/REINDEX instead, I regret it
as another false economy. Turn on autovacuum, make it run all the time
but at a slower average speed if you're concerned about its overhead,
and use CLUSTER once to blow away the bloat that accumulated before you
were doing things right.
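Concretely, that advice might look like the following sketch. The
settings are illustrative starting points in the spirit of the cost
delay values Scott mentioned, not tuned recommendations, and the table
and index names are hypothetical:

```sql
-- In postgresql.conf: keep autovacuum on, but slow it down so it
-- runs continuously at low I/O priority rather than in bursts.
--   autovacuum = on
--   autovacuum_vacuum_cost_delay = 20ms

-- One-time cleanup of the bloat accumulated before autovacuum was
-- running properly. Note CLUSTER takes an exclusive lock and
-- rewrites the whole table while it runs.
CLUSTER my_table USING my_table_pkey;
```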
3) There is a lot of speculation here and no measurements. What I'd be
doing in this case is regularly running something like the query at
http://wiki.postgresql.org/wiki/Disk_Usage (but without the lines that
filter out pg_catalog, because the catalogs are a strong suspect here)
while debugging the problem. Measure how big all the catalog tables and
indexes are, perform the operations that you believe make things better
or worse, then measure again. Turn autovacuum on, repeat the test, and
see if anything is different. This type of problem tends to be really
easy to quantify.
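A simple query in the spirit of that wiki page--kept deliberately
minimal, and *without* filtering out pg_catalog so catalog bloat stays
visible--would be something like:

```sql
-- Measure the biggest tables and indexes, catalogs included.
-- Run this before and after each operation and compare the numbers.
SELECT n.nspname AS schema,
       c.relname,
       pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind IN ('r', 'i')   -- ordinary tables and indexes
ORDER BY pg_total_relation_size(c.oid) DESC
LIMIT 20;
```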
--
Greg Smith greg@xxxxxxxxxxxxxxx Baltimore, MD