Gregory Stark wrote:
"Karl Wright" <kwright@xxxxxxxxxxxxx> writes:
This particular run lasted four days before a VACUUM became essential. The
symptom that indicates that VACUUM is needed seems to be that the CPU usage of
any given postgresql query skyrockets. Is this essentially correct?
Postgres is designed on the assumption that VACUUM is run regularly. By
"regularly" we're talking of an interval usually on the order of hours, or
even less. On some workloads some tables need to be vacuumed every 5 minutes,
for example.
Fine - but what if the previous vacuum is still in progress, and does
not finish in 5 minutes?
VACUUM doesn't require shutting down the system, it doesn't lock any tables or
otherwise prevent other jobs from making progress. It does add extra i/o but
there are knobs to throttle its i/o needs. The intention is that VACUUM run in
the background more or less continually using spare i/o bandwidth.
This spare bandwidth is apparently hard to come by in my particular
application. That's the only way I can reconcile your information with
it taking 4 days to complete.
The symptom of not having run vacuum regularly is that tables and indexes
bloat to larger sizes than necessary. If you run "VACUUM VERBOSE" it'll tell
you how much bloat your tables and indexes are suffering from (though the
output is a bit hard to interpret).
Table and index bloat slow things down but not generally by increasing cpu
usage. Usually they slow things down by causing queries to require more i/o.
Yes, that's what I understood, which is why I was puzzled by the effects
I was seeing.
It's only UPDATES and DELETES that create garbage tuples that need to be
vacuumed though. If some of your tables are mostly insert-only they might need
to be vacuumed as frequently or at all.
Well, the smaller tables don't change much, but the bigger tables have a
lively mix of inserts and updates, so I would expect these would need
vacuuming often.
I'll post again when I can find a vacuum schedule that seems to work.
Karl