I'm trying to follow the discussion on -hackers and decided I'd try putting everything I'm reading into my own words. It is probable some or even all of the following is simply wrong so please do not go acting on it without other people providing supporting evidence or comments. I am a database user, not a database programmer, but feel I do have a solid ability to understand and learn and enough experience to adequately represent a moderately skilled DBA who might see the release notes and scratch their head. During the application/restoration (replay) of WAL the modification of indexes may not be performed correctly resulting in physical rows/keys not having matching index entries. Any subsequent attempt to add a duplicate of the existing physical key will therefore succeed; thus resulting in a duplicate physical record being present and any future attempt to REINDEX the unique index column(s) to fail. A typical replay scenario would first have a row on the PK side of a relationship updated (though not any of the key columns - or any other indexed columns - since this is hot-related...?). If this update takes sufficiently long (or concurrency is otherwise extremely high) that another transaction attempted to take a lock on the PK (e.g., so that it could validate an FK relationship) then a "tuple-lock" operation is performed and added to the WAL. The reply of this WAL entry caused the locked index row to be effectively invisible. The core alteration in 9.3 that exposed this bug is that in order to improve concurrency updates to rows that did not hit indexes (i.e., hot-update capable) allowed other non-updating transactions to simultaneously acquire a lock sufficient to ensure that the core row elements (those that are indexed) remained unaltered while not caring whether specific non-indexed attributes were altered. Prior to 9.3 update locking was sufficient (i.e. exclusive) to cause the other sessions to wait for a lock and thus never hold one simultaneously. In that situation the WAL replay was effectively serialized with respect to the transactions and the index entry modifications were unnecessary but not incorrect. The most obvious test-and-correct mechanism would be to try and create new indexes and, if the creation fails, manually remove the duplicate rows from the table. A table re-write and/or re-index could only work if no duplicates entries were made (I am not sure of this line of thought...) so, for instance, if the PK column is tied to a sequence then no duplicates would ever have been entered. The unknown, for me, is whether MVCC duplication is impacted in which case any update could introduce a duplicate. Regardless of the duplicate record issue as soon as the replay happens the system cannot find the corresponding entry in the index and so any select queries are immediately at risk of silently returning incorrect results. I suppose any related FK constraints would also be broken and those too would remain invisible until the next forced validation of the constraint. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/A-user-s-interpretation-and-thoughts-of-the-WAL-replay-bug-in-9-3-tp5796432.html Sent from the PostgreSQL - general mailing list archive at Nabble.com. -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general