On 19/04/2010 4:51 PM, Craig Ringer wrote:
> There might be another possible approach that uses the system "xmin/xmax" fields of each tuple. That'd permit your incremental dumps to be done read-only, saving you a whole lot of expensive I/O and bloat. I'm just not sure what I'm thinking of will work yet.
Yeah. You can use a SERIALIZABLE transaction and txid_current_snapshot(), which almost magically solves your problem. In a single call it gives you the snapshot's xmin and xmax bounds plus the list of transaction IDs that were still in progress between those bounds when the snapshot was taken, so old uncommitted transactions don't force you to repeatedly dump the same data. That's all you need to know to do an intelligent incremental backup of a table.
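For example, a rough sketch (the snapshot value shown is made up; the accessor functions are the ones from Table 9-52 of the docs linked below):

  BEGIN ISOLATION LEVEL SERIALIZABLE, READ ONLY;

  SELECT txid_current_snapshot();
  -- returns something like 30:42:33,38  -- format is xmin:xmax:in-progress-xid-list

  -- Pull the parts back out of a saved snapshot value:
  SELECT txid_snapshot_xmin('30:42:33,38'::txid_snapshot);  -- 30
  SELECT txid_snapshot_xmax('30:42:33,38'::txid_snapshot);  -- 42
  SELECT txid_snapshot_xip('30:42:33,38'::txid_snapshot);   -- 33 and 38, one row each

  COMMIT;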
I haven't written the actual test code, but what you should need to do (according to my probably flawed understanding) is:
- Begin a read-only SERIALIZABLE transaction.

- Record txid_current_snapshot(), which you will need for the next run. We'll call the value of the last run's txid_current_snapshot() call 'txprev'.

- SELECT all rows that have:

    tablename.xmin::text::bigint >= txid_snapshot_xmax(txprev)
    OR tablename.xmin::text::bigint IN (SELECT txid_snapshot_xip(txprev))

  (txprev stands in for the saved snapshot value; the xid system column needs a cast to compare against the bigint txids, ignoring xid epoch/wraparound, and txid_snapshot_xmax() is the first as-yet-unassigned txid, hence >= rather than >.)

( The above doesn't consider deletion. Deletion is never a fun thing to handle in incremental backups/dumps. I'm not presently sure how it should be handled, or whether it *can* be handled at all without help from VACUUM and/or an ON DELETE trigger. )

If I get a chance, I'll play with this and see if it works in practice.

See:

http://www.postgresql.org/docs/current/interactive/functions-info.html
  Table 9-52. Transaction IDs and snapshots

http://www.postgresql.org/docs/current/interactive/storage-page-layout.html

\df pg_catalog.tx*

--
Craig Ringer