On Thu, 2020-01-16 at 16:50 +0000, Richard van der Hoff wrote: > I'm trying to track down the cause of some duplicate rows in a table > which I would expect to be impossible due to a unique constraint. I'm > hoping that somebody here will be able to suggest something I might have > missed. > > The problem relates to a bug filed against our application > (https://github.com/matrix-org/synapse/issues/6696). At first I put this > down to random data corruption on a single user's postgres instance, but > I've now seen three separate reports in as many days and am wondering if > there is more to it. [...] > So, question: what could we be doing wrong to get ourselves into this > situation? > > Some other datapoints which may be relevant: > * this has been reported by one user on postgres 9.6.15 and one on > 10.10, though it's hard to be certain of the version that was running > when the duplication occurred > * the constraint is added when the table is first created (before any > data is added) > * At least one user reports that he has recently migrated his database > from one server to another via a `pg_dump -C` and later piping into psql. I see no hint that this may be the problem, but I have seen corruption like this because of changes in the collations of the C library (which PostgreSQL uses). This only happens with collations other than C, and it cannot be caused by dump/restore. It may, however, be caused by the following: - Upgrading the operating system where PostgreSQL is running to a different glibc. - Streaming replication between machines with different glibc version, and failing over to the standby. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com