Andrew Sullivan wrote:
On Thu, Jun 01, 2006 at 01:54:08PM +0200, Antoine wrote:
Hi,
We just don't seem to be getting much benefit from autovacuum. Running
a manual vacuum seems to still be doing a LOT, which suggests to me
that I should either run a cron job and disable autovacuum, or just
run a cron job on top of autovacuum.
Don't know if this was covered in an earlier thread. Bear with me if so.
I'm working with 7.4.8 and 8.0.3 systems, and pg_autovacuum does have some
glitches ... in part solved by the integrated autovac in 8.1:
- in our env, clients occasionally hit max_connections. This is a known and
(sort of) desired pushback on load. However, that sometimes knocks pg_autovacuum
out.
- db server goes down for any reason: same problem.
Just restarting pg_autovacuum is not good enough; when pg_autovacuum terminates,
it loses its state, so big tables that change less than 50% between such
terminations may never get vacuumed (!)
For that reason, it's taken a switch to a Perl script run from cron every 5
minutes, that persists state in a table. The script is not a plug-compatible
match for pg_autovacuum (hardcoded rates; hardcoded distinction between user and
system tables), but you may find it useful.
--
Engineers think that equations approximate reality.
Physicists think that reality approximates the equations.
Mathematicians never make the connection.
#!/usr/bin/env perl
use strict;
use warnings;
use Carp;
use DBI;
use POSIX qw(strftime);
# Hardcoded (aggressive) autovacuum parameters:
my ($VAC_BASE, $VAC_RATE) = (1000, 0.8);
my ($ANA_BASE, $ANA_RATE) = ( 500, 0.4);
my $VERBOSE = 'VERBOSE';
my $start = time;
my $stamp = strftime "==== %FT%T autovac: ", localtime;
open STDERR, ">&STDOUT"; # Redirect PG "VERBOSE" output.
my $dbh = DBI->connect("dbi:Pg(PrintError=1,RaiseError=1):");
# REVISIT: move this to schema:
my $oid = $dbh->selectall_arrayref(<<"__SQL__")->[0][0];
SELECT oid FROM pg_class WHERE relname = 'autovac_state';
__SQL__
$dbh->do(<<"__SQL__") if !defined $oid;
CREATE TABLE public.autovac_state(
relid oid NOT NULL PRIMARY KEY,
name text NOT NULL,
analyze_time timestamptz,
vacuum_time timestamptz,
analyze_tups bigint, -- (n_tup_ins+n_tup_upd+n_tup_del) at last ANALYZE
vacuum_tups bigint -- (n_tup_upd+n_tup_del) at last VACUUM
);
__SQL__
# Calculate updates/additions to autovac_state:
my $anavac = join ";", map {$_->[0]} @{$dbh->selectall_arrayref(<<"__SQL__")};
SELECT * INTO TEMP new_state FROM (
SELECT
relid, -- identify tables by ID, so that (re)created tables always
-- are treated as fresh tables.
name, -- for constructing the vacuum/analyze command
old_relid, -- NULL means this will need a new state table entry
analyze_tups,-- _tups are used to update autovac_state
vacuum_tups,
CASE WHEN analyze_tups - prev_analyze_tups NOT BETWEEN 0 AND analyze_point
OR old_relid IS NULL
THEN now()
END AS analyze_time,
CASE WHEN vacuum_tups - prev_vacuum_tups NOT BETWEEN 0 AND vacuum_point
THEN now()
END AS vacuum_time
FROM (
SELECT N.nspname || '.' || C.relname AS name,
A.relid AS old_relid,
C.oid AS relid,
S.n_tup_ins + S.n_tup_upd + S.n_tup_del AS analyze_tups,
S.n_tup_upd + S.n_tup_del AS vacuum_tups,
COALESCE(A.analyze_tups,0) AS prev_analyze_tups,
COALESCE(A.vacuum_tups,0) AS prev_vacuum_tups,
CASE WHEN nspname ~ '^pg_' THEN 1.0 ELSE $ANA_RATE END
* C.reltuples + $ANA_BASE AS analyze_point,
CASE WHEN nspname ~ '^pg_' THEN 2.0 ELSE $VAC_RATE END
* C.reltuples + $VAC_BASE AS vacuum_point
FROM pg_class AS C
JOIN pg_namespace AS N ON N.oid = C.relnamespace
JOIN pg_stat_all_tables AS S ON S.relid = C.oid
LEFT JOIN autovac_state AS A ON A.relid = S.relid
WHERE N.nspname NOT LIKE 'pg_temp%'
) AS X
) AS X
WHERE analyze_time IS NOT NULL OR vacuum_time IS NOT NULL;
SELECT CASE WHEN vacuum_time IS NOT NULL
THEN 'VACUUM ANALYZE $VERBOSE ' || name
ELSE 'ANALYZE $VERBOSE ' || name
END
FROM new_state;
__SQL__
if ($anavac) {
print STDERR $stamp."start\n";
$dbh->do(<<"__SQL__");
$anavac;
UPDATE autovac_state
SET analyze_tups = N.analyze_tups,
vacuum_tups = CASE WHEN N.vacuum_time IS NULL
THEN autovac_state.vacuum_tups
ELSE N.vacuum_tups
END,
analyze_time = COALESCE(N.analyze_time, autovac_state.analyze_time),
vacuum_time = COALESCE(N.vacuum_time, autovac_state.vacuum_time)
FROM new_state AS N
WHERE N.relid = autovac_state.relid;
INSERT INTO autovac_state
SELECT relid, name, analyze_time, vacuum_time, analyze_tups, vacuum_tups
FROM new_state
WHERE old_relid IS NULL;
DELETE FROM autovac_state
WHERE analyze_time < now() - '1 day'::INTERVAL
AND relid NOT IN (SELECT oid FROM pg_class);
__SQL__
print STDERR $stamp.(time - $start)." secs\n";
}
$dbh->do("DROP TABLE new_state");
1;
__END__
=head1 SYNOPSIS
autovac - autovacuum with persistent state.
=head1 DESCRIPTION
C<autovac> is a replacement for C<pg_autovacuum>.
C<autovac> does a single C<pg_autovacuum> step, then saves its state in
the C<public.autovac_state> table. It should be run from cron, say,
every 5 minutes.
C<autovac> runs more aggressively for user tables
(pg_autovacuum -V 0.8 -v 1000) than for system tables.
When pg_stats_user_tables.(ins/upd/del) counts have been zeroed,
C<autovac> vacuums all tables.
To check when C<autovac> last analyzed/vacuumed a given table:
SELECT vacuum_time, analyze_time FROM autovac_state
WHERE name = 'public.message'
=head1 OUTPUT
If any action is taken, C<autovac> prints:
==== yyyy-mm-dd hh:mm:ss autovac start
... VACUUM/ANALYZE VERBOSE output ...
==== yyyy-mm-dd hh:mm:ss autovac <n> secs
=head1 PG_AUTOVACUUM
=cut