Re: [HACKERS] Autovacuum Improvements

Steve Atkins <steve@xxxxxxxxxxx> · Mon, 22 Jan 2007 15:39:10 -0800

On Jan 22, 2007, at 11:16 AM, Richard Huxton wrote:

Bruce Momjian wrote:
Yep, agreed on the random I/O issue.  The larger question is if  
you have
a huge table, do you care to reclaim 3% of the table size, rather  
than
just vacuum it when it gets to 10% dirty?  I realize the vacuum is  
going
to take a lot of time, but vacuuming to relaim 3% three times  
seems like
it is going to be more expensive than just vacuuming the 10%  
once.  And
vacuuming to reclaim 1% ten times seems even more expensive.  The
partial vacuum idea is starting to look like a loser to me again.

Buying a house with a 25-year mortgage is much more expensive than  
just paying cash too, but you don't always have a choice.

Surely the key benefit of the partial vacuuming thing is that you  
can at least do something useful with a large table if a full  
vacuum takes 24 hours and you only have 4 hours of idle I/O.

It's also occurred to me that all the discussion of scheduling way  
back when isn't directly addressing the issue. What most people  
want (I'm guessing) is to vacuum *when the user-workload allows*  
and the time-tabling is just a sysadmin first-approximation at that.

Yup. I'd really like for my app to be able to say "Hmm. No  
interactive users at the moment, no critical background tasks. Now  
would be a really good time for the DB to do some maintenance." but  
also to be able to interrupt the maintenance process if some new  
users or other system load show up.

With partial vacuuming possible, we can arrange things with just  
three thresholds and two measurements:
  Measurement 1 = system workload
  Measurement 2 = a per-table "requires vacuuming" value
  Threshold 1 = workload at which we do more vacuuming
  Threshold 2 = workload at which we do less vacuuming
  Threshold 3 = point at which a table is considered worth vacuuming.
Once every 10 seconds, the manager compares the current workload to  
the thresholds and starts a new vacuum, kills one or does nothing.  
New vacuum processes keep getting started as long as there is  
workload spare and tables that need vacuuming.

Now the trick of course is how you measure system workload in a  
meaningful manner.

I'd settle for a "start maintenance", "stop maintenance" API.  
Anything else (for instance the heuristics you suggest above) would  
definitely be gravy.

It's not going to be simple to do, though, I don't think.

Cheers,
  Steve