Alvaro Herrera wrote:
Matthew O'Connor wrote:
Glen Parker wrote:
If it isn't there somewhere already, I would ask to add:
4) Expose all information used by autovacuum to form its decisions.
You could argue that this is already there, although not easy to get at
I suppose. But all table threshold settings are available either in the
pg_autovacuum relation or the defaults via GUC variables, that plus a
little math will get the information autovacuum uses to form its decisions.
No, we currently don't expose the number of dead tuples which autovacuum
uses.
5) Expose a very easy way to discover autovacuum's opinion about a
particular table, for example "table_needs_vacuum(oid)", ignoring any
time constraints that may be in place.
This might be a nice feature however in the presence of the much talked
about but not yet developed maintenance window concept, I'm not sure how
this should work. That is, during business hours the table doesn't
need vacuuming, but it will when the evening maintenance window opens up.
I intend to work on the maintenance window idea for 8.3. I'm not sure
if I'll be able to introduce the worker process stuff in there as well.
I actually haven't done much design on the stuff so I can't say.
What does a maintenance window mean? I am slightly fearful that it as
other improvements to vacuum are made, it will change it's meaning.
There has been discussion about a bitmap of dirty pages in a relation
for vacuum to clean. Do that effect what maintenance means? eg. Does
maintenance mean I can only scan the whole relation for XID wrap in
maintenance mode and not during non-maintenance time. Does it mean we
don't vacuum at all in non-maintenance mode. Or do we just have a
different set of thresholds during maintenance.
Further to this was a patch a long time ago for partial vacuum, which
only vacuumed part of the relation. It was rejected on grounds of not
helping as the index cleanup is the expensive part. My view on this is,
if with a very large table you should be able to vacuum until you fill
your maintenance work mem. You are then forced to process indexes.
Then that is a good time to stop vacuuming. You would have to start the
process again effectively. This may also change the meaning of
maintenance window. Again, only full relation scans in maintenance
times, possibly something else entirely.
I am unsure of what the long term goal of the maintenance window is. I
understand it's to produce a time when vacuum is able to be more
aggressive on the system. But I do not know what that means in light of
other improvements such as those listed above. Coming up with a method
for maintenance window that just used a separate set of thresholds is
one option. However is that the best thing to do. Some clarity here
from others would probably help. But I also think we need to consider
the big picture of where vacuum is going before inventing a mechanism
that may not mean anything come 8.4
Now, if you (Matthew, or Glen as well!) were to work on that it'll be
appreciated ;-) and we could team up.
I am happy to try and put in some design thought and discussion with
others to come up with something that will work well.
Regards
Russell Smith