On Wednesday November 15 2006 6:30 am, Alvaro Herrera wrote:
> > The table in question appears to be the pathological case for
> > vacuum: very large with lots of frequent UPDATEs. It's essentially
> > a log table.
>
> A big log table where the log entries are being updated?
> Certainly sounds like a recipe for vacuum headaches.

I'm curious to know how others are dealing with this problem, and how
pgsql might support this case.

In our case, we have a 1.8GB OS cache and a 30GB DB cache, serving
around 200 transactions/second from a 110GB DB, and this problematic
table is 15GB on disk. So when it is vacuumed, I suspect it
essentially flushes the OS cache and half the DB cache, severely
impacting performance on an already cpu-bottlenecked machine. I have
attempted to adjust autovac to spread out its I/O impact, but then it
takes so long to run that other smaller, frequently-updated tables are
not vacuumed/analyzed in the meantime and performance starts to
suffer.

Suppose there simply are no "off-hours" periods when you can vacuum a
very large table with many frequent updates. (There is never a good
time to flush the caches.) How do you manage such a table in a 24x7
environment?

One idea would be to partition the table somehow, such that the chunks
being vacuumed are much smaller and thus not such an impact. On the
app side, I suppose we could break the table into multiple tables on
some dimension (time) to make the vacuum impacts smaller. But a pgsql
solution in the future would be nice.

I don't know the pgsql code well, but what if the free space map were
divided into smaller sectors such that individual sectors could be
vacuumed without having to hit the entire table? Or even simply
breaking up the 15GB vacuum job into 1GB chunks with some spacing
between them would help. Of course, it'd be important to keep the
smaller tables vacuumed/analyzed as needed in the in-between time.

I don't know what the best answer is here, just groping for ideas.
TIA.

Ed
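
For reference, the cost-based delay knobs I've been experimenting with
to spread out vacuum I/O are the ones below (the values shown are just
illustrative settings, not a recommendation):

```
# postgresql.conf -- cost-based vacuum delay (available in 8.1+)
autovacuum = on
autovacuum_vacuum_cost_delay = 20    # ms to sleep once the cost limit is hit
autovacuum_vacuum_cost_limit = 200   # accumulated page-cost before sleeping

# Same throttling for manually issued VACUUM:
vacuum_cost_delay = 20               # ms; 0 disables throttling entirely
vacuum_cost_limit = 200
```

Raising the delay (or lowering the limit) softens the I/O hit but is
exactly what stretches a 15GB vacuum out so long that the small tables
go unvacuumed in the meantime.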
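
On the app-side partitioning idea, a rough sketch of what I had in
mind, using table inheritance plus CHECK constraints (table and column
names here are made up for illustration):

```sql
-- Hypothetical time-partitioned log table via inheritance.
CREATE TABLE app_log (
    logged_at  timestamptz NOT NULL,
    status     text,
    payload    text
);

-- One child table per month, each with a CHECK constraint on the
-- time range it covers.
CREATE TABLE app_log_2006_11 (
    CHECK (logged_at >= '2006-11-01' AND logged_at < '2006-12-01')
) INHERITS (app_log);

CREATE TABLE app_log_2006_12 (
    CHECK (logged_at >= '2006-12-01' AND logged_at < '2007-01-01')
) INHERITS (app_log);

-- With constraint_exclusion = on (8.1+), queries filtering on
-- logged_at should only scan the matching children, and each child
-- can be vacuumed independently of the rest:
VACUUM ANALYZE app_log_2006_11;
```

Since UPDATEs mostly hit recent entries, only the newest child would
need frequent vacuuming, and each vacuum pass touches ~1 month of data
instead of the whole 15GB.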