This seems maybe a bit overkill to me. I think what would be more useful is if autovacuum could execute more than one vacuum at a time, and you could specify tables that are high priority (or possibly just say that all tables with less than X live tuples in them are high priority). That way a longer-running vacuum on a large table wouldn't prevent more vacuum-sensative tables (such as queues) from being vacuumed frequently enough. On Wed, Feb 01, 2006 at 03:50:25PM -0500, Chris Browne wrote: > matthew@xxxxxxxx ("Matthew T. O'Connor") writes: > > Hope that helps. Real world feed-back is always welcome. > > While I'm at it, I should throw in an idea that I had a little while > back about a "vacuum request manager." > > This is kind of orthogonal to everything else that has been happening > with pg_autovacuum... > > One of the troubles we have been hitting with our homebrew scripts is > when locking doesn't turn out, and they start submitting multiple > vacuums at once, which sometimes builds up "to ill." > > A thought I had was to create a daemon that would serially process > requests. It would just watch a table of requests, and when it finds > work, start work. > > We'd then have some sort of "injection" process that would tell the > daemon "Here's new work!" > > Requests would be defined thus: > > /* cbbrowne@[local]/dba2 vacdb=*/ \d vacuum_requests > Table "public.vacuum_requests" > Column | Type | Modifiers > --------------+--------------------------+------------------------ > vtable | text | not null > vhost | text | not null > vdatabase | text | not null > urgency | integer | not null default 1 > created_on | timestamp with time zone | not null default now() > completed_on | timestamp with time zone | > failed_at | timestamp with time zone | > Indexes: > "vacuum_requests_pkey" primary key, btree (vtable, vhost, vdatabase, created_on) > "vr_priority" btree (vhost, vdatabase, urgency) WHERE ((completed_on IS NULL) AND (failed_at IS NULL)) > > /* cbbrowne@[local]/dba2 vacdb=*/ \d vacuum_start > Table "public.vacuum_start" > Column | Type | Modifiers > --------------+--------------------------+------------------------ > vtable | text | not null > vhost | text | not null > vdatabase | text | not null > started_on | timestamp with time zone | not null default now() > completed_on | timestamp with time zone | > Indexes: > "vacuum_start_pkey" primary key, btree (vtable, vhost, vdatabase, started_on) > > /* cbbrowne@[local]/dba2 vacdb=*/ \d vacuum_failures > Table "public.vacuum_failures" > Column | Type | Modifiers > ------------+--------------------------+------------------------ > vtable | text | not null > vhost | text | not null > vdatabase | text | not null > started_on | timestamp with time zone | not null > failed_on | timestamp with time zone | not null default now() > Indexes: > "vacuum_failures_pkey" primary key, btree (vtable, vhost, vdatabase, started_on) > > > This has a bit more generality than would be needed for handling just > one postmaster; host/database would allow this to be used to manage > multiple backends... > > We have, in our "kludged-up scripts," three levels of granularity: > > 1. There are tables we vacuum every few minutes; they would be at > urgency 1; every few minutes, we would, in effect, run the query... > > insert into vacuum_requests (vtable, vhost, vdatabase, urgency) > select t.fqtablename, h.hostname, tld.name, 1 > from urgent_tables t, all_hosts h, all_tlds tld; > > 2. Then, there are "hourly" tables, at urgency level 2. > > Once an hour, we run: > > insert into vacuum_requests (vtable, vhost, vdatabase, urgency) > select t.fqtablename, h.hostname, tld.name, 2 > from hourly_tables t, all_hosts h, all_tlds tld; > > 3. Once a day, we'd do something kind of like: > > insert into vacuum_requests (vtable, vhost, vdatabase, urgency) > select table_schema || '.' || table_name, h.hostname, tld.name, 3 > from information_schema.tables, all_hosts h, all_tlds tld > where table_type = 'BASE TABLE' and table_schema in ('public', 'pg_catalog'); > > The event loop for the daemon would be to look up the highest priority > table, and add an entry to vacuum_start. > > Then it vacuums the table. > > If that succeeds, the table is marked as complete in both > vacuum_start, and, FOR EVERY ENTRY CURRENTLY OUTSTANDING, in > vacuum_requests. Thus, if a table is queued up 20 times, it will be > vacuumed once, and marked as done 20 times. > > If that fails, all the relevant entries in vacuum_start and > vacuum_requests are marked with the failure information, and a record > is added to the failures table. > > We're putting this off, pending the thought that, with 8.1, it's worth > testing out pg_autovacuum again. > > The above is an "in-the-database" way of queueing up requests, > associating priorities to them, and having the queue be > administrator-visible. > > We were anticipating using our present quasi-kludgy scripts to add our > favorite tables to the queue; it would seem a nice/natural thing for > there to be some automatic process (ala the pg_autovacuum daemon) that > could add things to the queue based on its knowledge of updates. > > My thought is that if anything about the above appears useful to > pg_autovacuum, I'd be happy if pg_autovacuum grabbed (stole? ;-)) some > of the ideas. > -- > "cbbrowne","@","cbbrowne.com" > http://cbbrowne.com/info/sap.html > "The X-Files are too optimistic. The truth is *not* out there..." > -- Anthony Ord <nws@xxxxxxxxxxxxxxxxxxxx> > > ---------------------------(end of broadcast)--------------------------- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@xxxxxxxxxxxxxx so that your > message can get through to the mailing list cleanly > -- Jim C. Nasby, Sr. Engineering Consultant jnasby@xxxxxxxxxxxxx Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461