n Fri, Oct 29, 2010 at 2:53 AM, Peter Geoghegan <peter.geoghegan86@xxxxxxxxx> wrote: > On 29 October 2010 03:04, Karl Pickett <karl.pickett@xxxxxxxxx> wrote: >> Hello Postgres Hackers, >> >> We have a simple 'event log' table that is insert only (by multiple >> concurrent clients). It has an integer primary key. We want to do >> incremental queries of this table every 5 minutes or so, i.e. "select >> * from events where id > LAST_ID_I_GOT" to insert into a separate >> reporting database. The problem is, this simple approach has a race >> that will forever skip uncommitted events. I.e., if 5000 was >> committed sooner than 4999, and we get 5000, we will never go back and >> get 4999 when it finally commits. How can we solve this? Basically >> it's a phantom row problem but it spans transactions. >> >> I looked at checking the internal 'xmin' column but the docs say that >> is 32 bit, and something like 'txid_current_snapshot' returns a 64 bit >> value. I don't get it. All I want to is make sure I skip over any >> rows that are newer than the oldest currently running transaction. >> Has nobody else run into this before? > > If I understand your question correctly, you want a "gapless" PK: > > http://www.varlena.com/GeneralBits/130.php > -- > Regards, > Peter Geoghegan > That's interesting, but we're fine with having gaps in the range that never appear. We also don't want to add a performance penalty for concurrent writers. We just don't want any ids to appear (commit) after we got a later id. To clarify, we are using a plain serial primary key and we already have plenty of holes - that's fine. We just want to do an incremental 'tail -f' of this giant table (along with some joins) to feed into a reporting server every few minutes. So we're treating it like a queue, but not deleting anything and having absolute real-time data is not required. It appears that theoretical options are: 1. Start a serializable transaction and wait until all earlier transactions are gone (query pg_stat_activity or something?) 2. Ignore rows that were created later than any other in progress transactions Both of these options assume that serials can never go backward as they're handed out to connections / xids. I think that's safe to assume? Either would be fine, I just don't know if they're officially supported by postgres. -- Karl Pickett -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general