Explaining duplicate rows in spite of unique index

"Albe Laurenz" <laurenz.albe@xxxxxxxxxx> · Tue, 23 Feb 2010 09:41:00 +0100

We recently found a couple of rows in a production database
that had identical values in the columns constituting the primary key.
(The problem surfaced because a pg_dump could not be restored.)

I'm now looking for explanations of how this could happen.

The rows originate from around the time when we had a hardware
failure that corrupted the file system. The database came up
after a file system check, and people continued working until
we noticed that some tables were corrupted.

At that point we restored an online backup and recovered past
the time of the hardware failure. The WALs were intact and recovery
completed successfully.

Does the following explanation sound plausible?
- After the file system check, in a table that seemed ok, some
  rows could have vanished or old rows could have become visible.
- Users re-inserted "vanished" rows or updated "old" rows.
  These transactions were recorded in the WALs.
- When we replayed those WALs, starting with a correct backup,
  the "impossible" transactions were replayed and caused
  the duplicate rows we see.

I don't know enough about the recovery process to tell if such
a scenario is possible.
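
To make the suspected sequence concrete, here is a toy model in Python.
It is only a sketch of the scenario, not of PostgreSQL internals: the
names (replay, duplicate_keys, backup, wal) and the sample rows are
made up, and it simply assumes that replay applies logged row
insertions to the restored backup without re-checking the primary key.

```python
from collections import Counter


def replay(base_rows, wal_records):
    """Apply logged insertions to a copy of the backup.

    Assumption of this toy model: replay re-applies each logged change
    as-is and never re-checks uniqueness of the key.
    """
    table = list(base_rows)
    for record in wal_records:
        table.append(record)
    return table


def duplicate_keys(rows):
    """Return the sorted key values that occur in more than one row."""
    counts = Counter(key for key, _payload in rows)
    return sorted(key for key, n in counts.items() if n > 1)


# The restored backup is intact and still contains key 2.
backup = [(1, "alice"), (2, "bob")]

# On the damaged database the row with key 2 had "vanished", so a user
# re-inserted it; that insert (and an unrelated one) was logged.
wal = [(2, "bob, re-entered"), (3, "carol")]

recovered = replay(backup, wal)
print(duplicate_keys(recovered))  # [2]
```

If real recovery behaves like replay() here, the re-inserted row would
end up next to the original one from the backup, which matches what
we see.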

What is your opinion? Are there other explanations (short of
a software bug in PostgreSQL)?

The database version is 8.3.6.

Yours,
Laurenz Albe
