On Tue, 2007-01-16 at 07:16 -0800, Darcy Buskermolen wrote: > On Tuesday 16 January 2007 06:29, Alvaro Herrera wrote: > > elein wrote: > > > Have you made any consideration of providing feedback on autovacuum to > > > users? Right now we don't even know what tables were vacuumed when and > > > what was reaped. This might actually be another topic. > > > > I'd like to hear other people's opinions on Darcy Buskermolen proposal > > to have a log table, on which we'd register what did we run, at what > > time, how long did it last, how many tuples did it clean, etc. I feel > > having it on the regular text log is useful but it's not good enough. > > Keep in mind that in the future we may want to peek at that collected > > information to be able to take better scheduling decisions (or at least > > inform the DBA that he sucks). > > > > Now, I'd like this to be a VACUUM thing, not autovacuum. That means > > that manually-run vacuums would be logged as well. > > Yes I did intend this thought for vacuum, not strictly autovacuum. I agree, for all VACUUMs: we need a log table. The only way we can get a feedback loop on what has come before is by remembering what happened. Simply logging it is interesting, but not enough. There is some complexity there, because with many applications a small table gets VACUUMed every few minutes, so the log table would become a frequently updated table itself. I'd also suggest that we might want to take account of the number of tuples removed by btree pre-split VACUUMs also. I also like the idea of a single scheduler and multiple child workers. The basic architecture is clear and obviously beneficial. What worries me is how the scheduler will work; there seems to be as many ideas as we have hackers. I'm wondering if we should provide the facility of a pluggable scheduler? That way you'd be able to fine tune the schedule to both the application and to the business requirements. That would allow integration with external workflow engines and job schedulers, for when VACUUMs need to not-conflict with external events. If no scheduler has been defined, just use a fairly simple default. The three main questions are - what is the maximum size of VACUUM that can start *now* - can *this* VACUUM start now? - which is the next VACUUM to run? If we have an API that allows those 3 questions to be asked, then a scheduler plug-in could supply the answers. That way any complex application rules (table A is available for VACUUM now for next 60 mins, table B is in constant use so we must use vacuum_delay), external events (long running reports have now finished, OK to VACUUM), time-based rules (e.g. first Sunday of the month 00:00 - 04:00 is scheduled downtime, first 3 days of the each month is financial accounting close) can be specified. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com