Ways to change a database collation with removing duplicates

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Could anyone suggest to me the ways to change a database collation with removing all the duplicates, caused by this change?

I have a pretty large database (around 500 Gb) that was created with `en_US.UTF-8` collation, but the new version of the application requires that the collation should be strictly `C`.

I can successfully create a dump of the old database using `pgdump`.

But when I'm importing the dump to the new database with `COLLATE=C`, I see a lot of errors on ALTER TABLE when creating primary keys, and the same - for `CREATE INDEX` commands:

ALTER TABLE
ERROR:  could not create unique index "access_tokens_pkey"
DETAIL:  Key (id)=(16734) is duplicated.
ERROR:  could not create unique index "access_tokens_token_key"
DETAIL:  Key (token)=(XXX) is duplicated.
CONTEXT:  parallel worker
ERROR:  could not create unique index "account_data_uniqueness"
DETAIL:  Key (user_id, account_data_type)=(@username:XXX, im.vector.setting.breadcrumbs) is duplicated.
CREATE INDEX
ERROR:  could not create unique index "e2e_cross_signing_keys_idx"
DETAIL:  Key (user_id, keytype, stream_id)=(@-----------------:matrix.org, master, 1606172) is duplicated.
ERROR:  could not create unique index "e2e_cross_signing_keys_stream_idx"
DETAIL:  Key (stream_id)=(1779009) is duplicated.
So, could anyone give some advice on how to perform the collation change with cleaning out all the duplicates? Thanks!

--
Best regards,
Alexey Murz Korepov.
E-mail: murznn@xxxxxxxxx
Messengers: Matrix - https://matrix.to/#/@murz:ru-matrix.org Telegram - @MurzNN

[Index of Archives]     [Postgresql Home]     [Postgresql General]     [Postgresql Performance]     [Postgresql PHP]     [Postgresql Jobs]     [PHP Users]     [PHP Databases]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Yosemite Forum]

  Powered by Linux