Marko Kreen wrote:
> 4-byte xids on btree may create data corruption.
Can you be more specific on this? I'm aware that xid is an unsigned
32-bit integer, which means we need to deal with the cast to int4
producing negative numbers. That forces us to split our range queries
into several pieces when they cross the overflow point. But I'm not
aware of any other issues.
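To illustrate what I mean (the table and column names here are
invented, and the window bounds are just example numbers): a scan for
xids in a window that crosses 2^31 has to be issued in two pieces,
because after the int4 cast everything above 2^31 - 1 shows up
negative:

  -- want xids in (1900000000, 2500000000]; 2500000000 is past 2^31,
  -- so after the cast it appears as 2500000000 - 2^32 = -1794967296
  SELECT *
    FROM my_changes
   WHERE txid_i4 > 1900000000        -- covers 1900000001 .. 2147483647
      OR txid_i4 <= -1794967296;     -- covers 2147483648 .. 2500000000, wrapped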
> Solution is to use 8-byte txids via txid_current() for indexing. [1]
The main issue with 8-byte values is their size. Multiplied by 500
million rows, the extra 4 bytes per row come to roughly 2 GB before
index overhead, which turns into a lot of space. I'd prefer to use
int4 if it's possible ...
Sorry, I'm not sure what you're suggesting with txid_current(). We're
currently using the txid_current_snapshot() method, which gives us the
snapshot's maximum transaction id plus the list of in-flight
transactions. We specifically exclude the in-flight transactions from
the query, then pick them up in subsequent queries once they have
committed.
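Concretely, it looks something like this (a sketch; the table and
column names are invented, and :last_xmax / :last_xip are the values
saved from the previous poll):

  -- at each poll, capture the snapshot's upper bound and in-flight list
  SELECT txid_snapshot_xmax(txid_current_snapshot());  -- becomes :cur_xmax
  SELECT txid_snapshot_xip(txid_current_snapshot());   -- becomes :cur_xip

  -- fetch changes committed since the previous poll
  SELECT *
    FROM my_changes
   WHERE (txid >= :last_xmax            -- new since last time, or ...
          OR txid IN (:last_xip))       -- ... in flight last time, committed since
     AND txid < :cur_xmax               -- nothing newer than this snapshot
     AND txid NOT IN (:cur_xip);        -- still in flight; defer to next poll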
> See pgq.batch_event_sql() function in Skytools [2] for how to
> query txids between snapshots efficiently and without being affected
> by long transactions.
I'll take a look.
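If I'm reading the idea right, a between-snapshots query built on the
visibility helpers would look roughly like this (my own sketch, not
the actual body of pgq.batch_event_sql; names invented as before):

  -- :prev_snap and :cur_snap are txid_snapshot values saved at each poll
  SELECT c.*
    FROM my_changes c
   WHERE c.txid >= txid_snapshot_xmin(:prev_snap)          -- index-friendly cutoff
     AND NOT txid_visible_in_snapshot(c.txid, :prev_snap)  -- not committed then
     AND txid_visible_in_snapshot(c.txid, :cur_snap);      -- committed now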
> In fact perhaps you can use PgQ directly instead of building your own.
> It is built quite similarly to what you are planning - periodic
> snapshots and then queries on txids to get the data.
I must admit I'm not very familiar with PgQ, but I did take a quick
look when evaluating options. I'm possibly misguided here, but
serialising everything through a queue doesn't seem like the most
efficient way of replicating large changesets. Another issue is that
it would only allow read-once behaviour, which prevents a client from
re-replicating if something goes wrong.

How would you trigger message creation when using PgQ? Would you
typically use triggers on the tables to be replicated?
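For example, is it something along these lines? (guessing from the
Skytools docs; the queue and table names are invented):

  -- pgq.logutriga serialises the changed row and inserts it as a queue event
  CREATE TRIGGER replicate_my_table
  AFTER INSERT OR UPDATE OR DELETE ON my_table
  FOR EACH ROW
  EXECUTE PROCEDURE pgq.logutriga('my_queue');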
Brett