Alvaro Herrera wrote:
Csaba Nagy wrote:
Alternatively, perhaps a threshold so that a table is only considered
for vacuum if:
(table-size * overall-activity-in-last-hour) < threshold
Ideally you'd define your units appropriately so that you could just
define the threshold in postgresql.conf as 30% (of peak activity in the
last 100 hours, say).
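
A minimal sketch of that check, to make the units question concrete. The
normalisation is an assumption, not something the proposal specifies:
table size is taken relative to the largest table and activity relative
to the peak over the last 100 hours, so both are fractions and the
threshold can be written as 0.30. The function and parameter names are
hypothetical.

```python
def eligible_for_vacuum(table_pages, largest_table_pages,
                        activity_last_hour, peak_activity_100h,
                        threshold=0.30):
    """Return True if the table may be considered for vacuum right now.

    Implements (table-size * overall-activity-in-last-hour) < threshold,
    with both factors scaled to the range 0..1 (an assumption).
    """
    if largest_table_pages <= 0 or peak_activity_100h <= 0:
        # No size or load data yet, so there is nothing to hold back for.
        return True
    relative_size = table_pages / largest_table_pages
    relative_activity = activity_last_hour / peak_activity_100h
    return relative_size * relative_activity < threshold
```

With these units a huge table is held back while the system is busy, but
a small table passes the check at almost any load.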
No, this is definitely not enough. The problem scenario is when
autovacuum starts vacuuming a huge table, which keeps it busy for 10
hours, and in the meantime the small but frequently updated tables get
awfully bloated...
The only solution to that is to have multiple vacuums running in
parallel, and it would be really nice if those multiple vacuums would be
coordinated by autovacuum too...
Yes, I agree, having multiple "autovacuum workers" would be useful.
Bruce, I think there are a couple of items here that might be worth
adding to the TODO list.
1) Allow multiple "autovacuum workers": Currently autovacuum is only
capable of issuing one vacuum command at a time. For most workloads
this is sufficient, but it falls down when a hot (very actively
updated) table goes unvacuumed for a long period of time because a
large table is currently being worked on.
2) Once we can have multiple autovacuum workers: Create the concept of
hot tables that require more attention and should never be ignored for
more than X minutes. Perhaps have one "autovacuum worker" per hot table?
(What do people think of this?)
3) Create "Maintenance Windows" for autovacuum: Currently autovacuum
makes all of its decisions based on a single per-table threshold value.
Maintenance windows would allow the setting of a per-window, per-table
threshold. This makes it possible to, for example, forbid (or strongly
discourage) autovacuum from doing maintenance work during normal
business hours, either for the entire system or for specific tables.
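
To illustrate item 3, here is a rough sketch of how a per-window,
per-table threshold lookup might behave. Everything in it is
hypothetical: the class and function names are made up for the example,
the threshold is simplified to a plain dead-tuple count, and a threshold
of None is used to mean "autovacuum is forbidden in this window".

```python
from dataclasses import dataclass, field
from datetime import time
from typing import Dict, List, Optional


@dataclass
class MaintenanceWindow:
    start: time
    end: time
    # Threshold for tables without their own entry; None forbids vacuum.
    default_threshold: Optional[float]
    # Per-table overrides for this window; None forbids vacuum.
    per_table: Dict[str, Optional[float]] = field(default_factory=dict)

    def contains(self, now: time) -> bool:
        if self.start <= self.end:
            return self.start <= now < self.end
        # The window wraps past midnight, e.g. 22:00 to 06:00.
        return now >= self.start or now < self.end

    def threshold_for(self, table: str) -> Optional[float]:
        return self.per_table.get(table, self.default_threshold)


def should_vacuum(windows: List[MaintenanceWindow], table: str,
                  dead_tuples: float, now: time,
                  fallback_threshold: float) -> bool:
    """Decide using the first window that covers `now`, else the fallback."""
    for window in windows:
        if window.contains(now):
            threshold = window.threshold_for(table)
            if threshold is None:
                return False  # forbidden in this window
            return dead_tuples >= threshold
    return dead_tuples >= fallback_threshold


windows = [
    # Business hours: no vacuuming, except the small "sessions" table.
    MaintenanceWindow(time(9), time(17), None, {"sessions": 500}),
    # Overnight: vacuum more eagerly than usual.
    MaintenanceWindow(time(22), time(6), 1000),
]
```

"Strongly discourage" rather than "forbid" would simply be a very high
threshold for the window instead of None.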
None of those three items is on the TODO list; however, I think there is
general consensus that they (at least 1 and 3) are good ideas.
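
For items 1 and 2, a rough sketch of how a free worker might choose its
next table. This is only one possible policy, not a design anyone has
agreed on: hot tables that have gone longer than X minutes without a
vacuum are served first, and a table already being vacuumed by another
worker is skipped. All names (TableState, pick_next_table,
max_hot_age_minutes) are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass
class TableState:
    name: str
    hot: bool                    # flagged as a hot table
    minutes_since_vacuum: float
    needs_vacuum: bool           # the normal threshold has been crossed


def pick_next_table(tables: Iterable[TableState], in_progress: Set[str],
                    max_hot_age_minutes: float) -> Optional[str]:
    """Return the table a free worker should vacuum next, or None."""
    # Never hand out a table another worker is already vacuuming.
    candidates = [t for t in tables if t.name not in in_progress]

    # Hot tables ignored for too long come first, most overdue first.
    overdue = [t for t in candidates
               if t.hot and t.minutes_since_vacuum >= max_hot_age_minutes]
    if overdue:
        return max(overdue, key=lambda t: t.minutes_since_vacuum).name

    # Otherwise take whichever pending table has waited longest.
    pending = [t for t in candidates if t.needs_vacuum]
    if pending:
        return max(pending, key=lambda t: t.minutes_since_vacuum).name
    return None


tables = [
    TableState("big_history", hot=False, minutes_since_vacuum=600,
               needs_vacuum=True),
    TableState("queue", hot=True, minutes_since_vacuum=12,
               needs_vacuum=True),
    TableState("sessions", hot=True, minutes_since_vacuum=3,
               needs_vacuum=False),
]
```

With a second worker, "queue" gets vacuumed even while the first worker
is still busy with "big_history", which is exactly the case that fails
today.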