On 1/16/20 8:50 AM, Richard van der Hoff wrote:
I'm trying to track down the cause of some duplicate rows in a table
which I would expect to be impossible due to a unique constraint. I'm
hoping that somebody here will be able to suggest something I might have
missed.
The problem relates to a bug filed against our application
(https://github.com/matrix-org/synapse/issues/6696). At first I put this
down to random data corruption on a single user's postgres instance, but
I've now seen three separate reports in as many days and am wondering if
there is more to it.
We have a table whose schema is as follows:
synapse=# \d current_state_events
Table "public.current_state_events"
Column | Type | Modifiers
------------+------+-----------
event_id | text | not null
room_id | text | not null
type | text | not null
state_key | text | not null
membership | text |
Indexes:
"current_state_events_event_id_key" UNIQUE CONSTRAINT, btree
(event_id)
"current_state_events_room_id_type_state_key_key" UNIQUE
CONSTRAINT, btree (room_id, type, state_key)
"current_state_events_member_index" btree (state_key) WHERE type =
'm.room.member'::text
Despite the presence of the
current_state_events_room_id_type_state_key_key constraint, several
users have reported seeing errors which suggest that their tables have
duplicate rows for the same (room_id, type, state_key) triplet and
indeed querying confirms that to be the case:
synapse=> select count(*), room_id, type, state_key from
current_state_events group by 2,3,4 order by count(*) DESC LIMIT 2;
count | room_id | type | state_key
-------+-----------------------------------+---------------+-------------------------------------
3 | !ueLfVrSWYGMnFnoCbT:darkfasel.net | m.room.member |
@irc_ebi_:darkfasel.net
3 | !HwocBmCtBcHQhILtYQ:matrix.org | m.room.member |
@freenode_AlmarShenwan_:matrix.org
(2 rows)
I'm assuming the above are obfuscated?
Further investigation suggests that these are genuinely separate rows
rather than duplicate entries in an index.
If you use length() on the values are they the same?
The index appears to consider itself valid:
synapse=> select i.* from pg_class c join pg_index i on
i.indexrelid=c.oid where
relname='current_state_events_room_id_type_state_key_key';
indexrelid | indrelid | indnatts | indisunique | indisprimary |
indisexclusion | indimmediate | indisclustered | indisvalid |
indcheckxmin | indisready | indislive | indisreplident | indkey |
indcollation | indclass | indoption | indexprs | indpred
------------+----------+----------+-------------+--------------+----------------+--------------+----------------+------------+--------------+------------+-----------+----------------+--------+--------------+----------------+-----------+----------+---------
17023 | 16456 | 3 | t | f | f
| t | f | t | f | t
| t | f | 2 3 4 | 100 100 100 | 3126
3126 3126 | 0 0 0 | |
(1 row)
So, question: what could we be doing wrong to get ourselves into this
situation?
Some other datapoints which may be relevant:
* this has been reported by one user on postgres 9.6.15 and one on
10.10, though it's hard to be certain of the version that was running
when the duplication occurred
* the constraint is added when the table is first created (before any
data is added)
* At least one user reports that he has recently migrated his database
from one server to another via a `pg_dump -C` and later piping into psql.
--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx