Am Montag, Juni 01, 2020 12:56 CEST, schrieb Julien Rouhaud <rjuju123@xxxxxxxxx>:
Hello,
On Mon, Jun 1, 2020 at 11:15 AM Bernhard Beroun <bernhard@xxxxxxxxx> wrote:
>
> Hello,
>
> I am experiencing a strange thing on my production database server, which I can't explain.
>
> On my production database server, there is a table called "label_suggestion" which has a unique constraint on the "name" column. If I do a "\d+ label_suggestion" I can see, that the unique constraint shows up with
>
> "label_suggestion_name_unique" UNIQUE CONSTRAINT, btree (name)
>
> in the output. But when I execute the following query
>
> SELECT name FROM label_suggestion GROUP BY name HAVING COUNT(name) > 1
>
> I can see that there are actually entries with a duplicate name in the database, which makes me believe that the unique constraint isn't there at all. (or at least it's not enforced)
It looks like the underlying index is corrupted. Did you have any
issue on that server? If the datatype is collatable, another
possibility would be that the underlying glibc version (or the
equivalent on your system) was upgraded to a version with different
ordering for your collation. Recently, glibc 2.28 is a quite likely
scenario, see for instance
https://postgresql.verite.pro/blog/2018/08/27/glibc-upgrade.html, but
it could also be some bug, possibly in earlier version if you updated
postgres since the initial install.
> Next, I dumped the complete production database (via pg_dumpall) and applied the dump on my local machine. I did again a "\d+ label_suggestion" and now the unique constraint doesn't show up anymore.
There's probably an error displayed during the restore. The constraint
are restored after the data, so there's no way that the constraint can
be restored if you have duplicated values.
You should definitely find out how you ended up in this situation and
fix the root cause before trying to manually clean up the data.
you are spot on with your failure analysis. I initially had PostgreSQL 9.6 running on Debian Stretch (glibc < 2.28) and then migrated to Debian Buster (glibc 2.28) a few months ago.
You are absolutely right, there is indeed an error - completely missed that!There's probably an error displayed during the restore. The constraint
are restored after the data, so there's no way that the constraint can
be restored if you have duplicated values.
Many thanks for your help, Julien - you really helped me with this one. Very much appreciated!
Is there something special I need to consider when cleaning up the duplicated values? My current plan would be to run the migrate script that I've written (which cleans up the mess and get rids of the duplicate values) and then reindex with "REINDEX database <mydatabase>".
Is that right or am I missing something here?
Thanks,
Bernhard