Re: Unexpectedly high disk space usage

Lists <lists@xxxxxxxxxxxxxxxxxx> · Wed, 07 Nov 2012 14:15:17 -0800

On 11/07/2012 12:42 PM, Tom Lane wrote:
So you've turned off autovacuum, and are carefully not vacuuming the
system catalogs.  That's your problem all right.  Is there a
particularly good reason why this script isn't a one-liner "VACUUM"?

Back in the 8.x days, we experienced "vacuum full analyze" occasionally 
causing other processes to hang/timeout. In an attempt to minimize the 
impact of the locking, we updated the script to vacuum one table at a 
time, which seemed to work well throughout the 8.x series. I'd happily 
accept that this conclusion may have simply have been wrong, but it 
worked well enough that nobody complained and life was good. After 
switching to 9.x, we read that the "full" vacuum was less useful and so 
the script was changed to "vacuum analyze $table" rather than "vacuum 
full analyze $table".

Are you sure that once-a-day vacuuming is sufficient, even if it was
covering the system catalogs?  If you've managed to bloat pg_attribute
to 36GB, I suspect you've got enough churn (perhaps from temp tables)
that you really need the catalogs vacuumed more often.
The only thing that I could find in the docs even mentioning the idea of 
vacuuming catalogs is this sentence:

(A manual VACUUM should fix the problem, as suggested by the hint; but 
note that the VACUUM must be performed by a superuser, else it will fail 
to process system catalogs and thus not be able to advance the 
database's datfrozenxid.)
http://www.postgresql.org/docs/9.1/static/routine-vacuuming.html

This does NOT clearly say that the end user could vacuum catalogs, let 
alone that it's necessary or even a good idea. Otherwise, the only 
mention is of tables, and there's no mention of the idea that tables are 
anything but user space.

My advice is dump, reload, and *don't* turn off autovacuum.

... because it
occasionally causes transactions and queries to hang when an update
causes a vacuum mid-day, effectively taking us offline randomly.
I suspect this claim is based on ancient and no longer very relevant
experience.

We tried several times to turn on autovacuum with 9.1 and had problems 
every time. If our use case is particularly special, I'd love to work 
with you to get autovacuum to work in our situation too as it would make 
life easier for us! But for the past few months, every time we've turned 
it on, we've had our phones swamped with customers who are unable to use 
our system while our application monitors scream bloody murder, at least 
weekly.

From what we could tell (under extreme pressure to get it all working 
again ASAP, mind you) it seemed that when doing a large update from 
within a transaction, autovacuum would get triggered before the 
transaction completed, causing the transaction to hang or at least slow 
way down, causing timeouts to occur with load balancers, so customers 
would then try again, compounding the ongoing problem. Pretty soon you 
have not only I/O issues, but also locking issues and upset customers. 
This issue may be compounded because we make fairly extensive use of 
dblink and temp tables to aggregate data for our customers who have 
multiple sites.

-Ben

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general