Lists wrote:

> There's a wealth of how to tune PG instruction that's old and
> (based on this thread alone) often stale enough to be classified
> as disinformative. For example, nearest I can tell, the entirety
> of this page is just wrong and/or irrelevant for 9.x and up:
> http://wiki.postgresql.org/wiki/VACUUM_FULL. In contrast to the
> advice to "avoid Vacuum Full" [...] comes the insight that vacuum
> full is necessary to clean up stale data that is not at the "end"
> of the table. (See Jeff Janes 11/10/2012 email)

I thought about posting a clarification to what Jeff said there --
it is only necessary to run VACUUM FULL to "free up" space in the
sense of having the database give disk space back to the OS, so that
the next time you need space in the table it must be re-allocated
from the OS. This can be a bad thing in terms of slower allocation
of space for new tuples and allocation of non-contiguous disk space.
You are much better off running autovacuum aggressively enough that
you don't need to run VACUUM FULL than to run it periodically. I
have gone for years at a time on large databases (up to several TB)
without ever running VACUUM FULL on a table. It is a valuable tool
to recover from a bad state caused by failure to properly maintain
the database; it is not a substitute for doing things right.

>> Long running transactions can cause autovacuum processes to stall
>> out or be autocancelled. "Long running transactions" - how long?
>> In our system it's rare to have a transaction (even a prepared
>> transaction) last much longer than a few minutes. Is that enough
>> time to cause problems with AutoVacuum?

The only situation where I would expect that to be a problem is in a
very small table which is updated hundreds of times per second.

> Should I increase the max_workers field from the default of 3 to
> (perhaps) 10? Noting that my solution to the disk space problem is
> effectively a max_worker of 1 since it's all done sequentially, I
> wonder if reducing max_workers would actually be better?

I would increase autovacuum_max_workers.

> Also, what's the "thread count"? Is that max_workers?

I believe that's what was meant.

> Why would I want to reduce the cost delay to 0, and how does this
> relate to cost_limit?

I've never gone all the way to 0. I would be reluctant to go below
10; rather, I would increase the cost limit. Autovacuum adds up
costs for the work it is doing, and when it hits the limit it sleeps
for the cost delay interval. I would think you would want the
process to yield periodically so that other work can get done. (A
rough sketch of the sort of settings I mean is further down, after
the monitoring notes.)

> given my substantial I/O subsystem, I'd want to drop cost_delay to
> near zero and set the cost_limit really high, which is a rough
> restatement of the last quoted paragraph above. (I think)

I agree.

> Assuming that I make these suggestions and notice a subsequent
> system load problem, what information should I be gathering in
> order to provide better post-incident forensics?

Run a report on bloat periodically, to make sure it is staying under
control. You might want to monitor for long-running transactions and
for prepared transactions which the transaction manager has lost
track of. They can be real killers. During any incident of poor
performance, it is good to gather iostat or vmstat samples at an
interval of 1 to 10 seconds for a few minutes, and to capture the
contents of pg_stat_activity and pg_locks.
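If it would help, here is a minimal sketch of the sort of monitoring
queries I have in mind; the column names assume 9.2 or later (older
releases spell these procpid and current_query in pg_stat_activity),
so treat it as a starting point rather than a canned solution. For
the bloat report itself, something like the bloat check in
check_postgres is a reasonable starting point.

    -- Transactions that have been open a long time (possible
    -- autovacuum blockers).
    SELECT pid, usename, now() - xact_start AS xact_age, state, query
      FROM pg_stat_activity
     WHERE xact_start IS NOT NULL
     ORDER BY xact_start
     LIMIT 10;

    -- Prepared transactions that were never committed or rolled back.
    SELECT gid, owner, database, now() - prepared AS age
      FROM pg_prepared_xacts
     ORDER BY prepared;

    -- Quick snapshots to capture during an incident of poor
    -- performance.
    SELECT now() AS captured_at, * FROM pg_stat_activity;
    SELECT now() AS captured_at, * FROM pg_locks;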
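And to put some numbers on the autovacuum settings discussed above
(and the logging I'll get to next), here is a rough sketch of the
relevant postgresql.conf entries. These values are placeholders to
experiment with under your own load, not recommendations:

    # postgresql.conf -- illustrative starting points only
    autovacuum = on
    autovacuum_max_workers = 10          # up from the default of 3
    autovacuum_vacuum_cost_delay = 10ms  # I'd be reluctant to go below 10
    autovacuum_vacuum_cost_limit = 2000  # raise the limit rather than
                                         # zeroing the delay

    # logging that helps line slow periods up with server activity
    log_checkpoints = on
    log_autovacuum_min_duration = 0      # log every autovacuum action
    log_min_duration_statement = 1000    # log statements over 1 second (ms)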
Turning on logging of checkpoint activity, autovacuum activity, and
slow queries can provide useful information when you match times
from the logging up against the times of slow periods.

> Lastly, there's the question of reindexing before full vacuum.
> I've observed that not doing a manual reindex prior to vacuum full
> did not, in fact, free up the space, even though I've been told
> that reindex is implicit in the vacuum process. (?!) I'm confident
> that I can reproduce this behavior given a bit of time to allow
> one of our database servers to bloat back up.

You never did say what version that was on. In 9.0 and later,
VACUUM FULL will drop and recreate all indexes from scratch after
rebuilding the heap, so it's pretty hard to imagine why dropping and
recreating all indexes right *before* rebuilding the heap would have
much impact. Prior to 9.0, VACUUM FULL would bloat indexes, so it
was always a good idea to REINDEX in addition to running VACUUM
FULL, although it was much more effective to run it *after* the
VACUUM FULL.

-Kevin

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general