On 3/25/06, adey <adey11@xxxxxxxxx> wrote:
Two questions in this regard, please:
1) Is tuple theory not the root of this problem?
2) Vacuum does much the same as a traditional database reorg, and online reorgs are a reality now.

1) If I understand tuple theory correctly, copies of rows are created through normal Postgres processing and expire after a period, leaving multiple redundant row copies to be marked dead and then vacuumed out. Most databases have been built around a single copy of each row with a sophisticated locking mechanism (which Postgres has some of anyway), and the industry has developed methods and designs to exploit locking to best advantage. Even with tuples, locking is still evident in Postgres.

OR

2) Could VACUUM FULL not be redesigned to run online without locking tables and users, like a conventional online reorg? For example: work on one data page (or one data file) at a time instead of locking the whole table, with a short period at the end to lock the table, "compress" the remaining populated data pages, and release disk space back to the OS; and have a per-table VACUUM FULL tidy up the wraparound value, thereby avoiding a full-database vacuum for longer periods. In this way vacuum could be performed regularly and be less intrusive. Nowadays 24x7 operation is more of a reality, and we can't afford to take systems down for many hours to perform regular maintenance.

(It would be extremely helpful to DBAs with little OS experience or access to have more automation in PGAdmin, especially task scheduling and alerting, so SQL could be scheduled in PGAdmin instead of crontab, which is usually a sysadmin function.)
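To make the locking difference behind point 2 concrete, a minimal sketch (the table name is hypothetical):

    -- Plain (lazy) VACUUM marks dead row versions as reusable free space and
    -- does not block ordinary reads or writes against the table.
    VACUUM VERBOSE orders;

    -- VACUUM FULL compacts the table and can hand disk space back to the OS,
    -- but it holds an exclusive lock on the table for its entire run.
    VACUUM FULL orders;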
On 3/25/06, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
Peter Eisentraut <peter_e@xxxxxxx> writes:
> On Friday, 24 March 2006 05:48, Tom Lane wrote:
>> Well, the VACUUM FULL algorithm is incapable of shrinking indexes ---
>> the only way is REINDEX, or something else that reconstructs indexes
>> from scratch, such as CLUSTER. One of the things we need to look into
>> is putting more smarts into VACUUM so that it automatically does
>> something reasonable when faced with extreme cases like these.
> If the user is running VACUUM FULL, he has presumably determined that the
> table is too bloated to be recovered in a graceful way, and quite likely the
> indexes are going to be bloated similarly. So seemingly one might as well
> launch a reindexing on the table after VACUUM FULL has done its thing.
> Whether that should be automatic is another question but perhaps the advice
> should be documented somewhere?
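The advice being discussed here amounts to something like the following (the table name is hypothetical):

    -- VACUUM FULL compacts the heap but cannot shrink the indexes, so the
    -- indexes are rebuilt from scratch afterwards to recover their space.
    VACUUM FULL bloated_table;
    REINDEX TABLE bloated_table;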
Actually, I wonder whether VACUUM FULL shouldn't be thrown away and
replaced by something else entirely. That algorithm only really works
nicely when just a small percentage of the rows need to be moved to
re-compact the table --- if you're moving lots of rows, it makes the
index bloat situation *worse* not better because of the transient need
for index entries pointing to both copies of moved rows. Lazy VACUUM
has become the de-facto standard for situations where there's not a huge
amount of empty space, and so it's not clear where the sweet spot is for
VACUUM FULL anymore. If you've got enough disk space, a rewrite (like
CLUSTER or ALTER TABLE) is going to blow the doors off VACUUM FULL,
let alone VACUUM FULL plus REINDEX. Not to mention that for
sufficiently huge tables, VACUUM FULL fails outright because it runs out
of RAM.
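A rewrite of the kind described above looks roughly like this (table and index names are hypothetical; this uses the CLUSTER syntax of that era):

    -- CLUSTER copies the live rows into a fresh heap file in index order and
    -- rebuilds every index from scratch, then swaps the new files into place,
    -- so both table and index bloat are removed in one pass.
    CLUSTER bloated_table_pkey ON bloated_table;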
We need to fix CLUSTER to make it MVCC-safe (ie, not discard
recently-dead rows), and it'd be nice to have something like it that
didn't worry about ordering but just did a seqscan of the source table.
Then I'd be inclined to recommend that instead of VACUUM FULL for most
cases of severe bloat.
Unfortunately this all breaks down for shared system catalogs and the
core (nailed-in) catalogs, because we can't change their relfilenodes
and so the crash-safe CLUSTER/REINDEX approach doesn't work. We still
need a new idea or two there.
regards, tom lane