Re: Enabling Autovacuum Postgres 9.1 (was Unexpectedly high disk space usage)

Jeff Janes <jeff.janes@xxxxxxxxx> · Tue, 13 Nov 2012 10:31:24 -0800

On Mon, Nov 12, 2012 at 12:04 PM, Lists <lists@xxxxxxxxxxxxxxxxxx> wrote:
>
>
> Should I increase the max_workers field from the default of 3 to (perhaps)
> 10?

I would not.  You report that the reason you turned off autovac is
because it made your database crawl when it kicked in.  That suggests
that if anything you should reduce that parameter (assuming you are
still doing manual vacuums at off-peak hours, leaving autovacuum to
only mop up what is left).

>
> Why would I want to reduce the cost delay to 0, and how does this relate to
> cost_limit? Careful reading of the docs:
> http://www.postgresql.org/docs/9.1/static/runtime-config-resource.html makes
> me believe that, given my substantial I/O subsystem, I'd want to drop
> cost_delay to near zero and set the cost_limit really high, which is a rough
> restatement of the last quoted paragraph above. (I think)

Given that autovac kicking in destroys your performance, I think that
your I/O subsystem may not be all that you think it is.  Do you have
test/dev/QA system with the same subsystem that you can use for
investigation?  If so, do you have a vaguely realistic load generator
to drive those systems?

> Assuming that I make these suggestions and notice a subsequent system load
> problem, what information should I be gathering in order to provide better
> post-incident forensics?

If you are going to be focusing your undivided attention on monitoring
the system during the period, just keeping a window open with "top"
running is invaluable. (On most implementations, if you hit 'c' it
will toggle the command display so you can see the results of
"update_process_title=on")

Also, "sar" is useful, and on most systems has the advantage that its
stats are always being gathered without you having to do anything, so
it works well for unexpected problems arising.  I often just have
"vmstat 1 -t" running in the background streaming into a log file, for
the same reason.

For internal to pgsql, set log_min_duration_statement to a value which
few statements will exceed under normal operations, but many will when
things bog down.  That way you can figure out exactly when things
bogged down after unattended operation, to correlate it with the
sar/vmstat/etc reports.

I'd also set for the probationary period (if you haven't already):

log_lock_waits = on
log_checkpoints = on
log_autovacuum_min_duration = 0 (or some smallish positive value)

The last one only logs when it finishes vacuuming a table.  I wish
there was a way to make it log when it started as well, but I don't
think there is.

Cheers,

Jeff

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general