From: pgsql-admin-owner@xxxxxxxxxxxxxx [mailto:pgsql-admin-owner@xxxxxxxxxxxxxx] On Behalf Of hydra
Sent: Friday, June 05, 2015 12:33 AM
To: pgsql-admin
Subject: Re: replication consistency checking
On Fri, Jun 5, 2015 at 4:40 AM, Sergey Konoplev <gray.ru@xxxxxxxxx> wrote:
On Wed, Jun 3, 2015 at 9:49 PM, hydra <hydrapolic@xxxxxxxxx> wrote:
> After setting up streaming replication, is it possible to check whether the
> slave has the same data as the master?
>
> In the MySQL world there is the percona-toolkit with pt-table-checksum that
> does this job:
> https://www.percona.com/doc/percona-toolkit/2.2/pt-table-checksum.htmlI believe you only need
http://www.postgresql.org/docs/9.4/static/app-initdb.html#APP-INITDB-DATA-CHECKSUMS.
--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA
http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (499) 346-7196, +7 (988) 888-1979
gray.ru@xxxxxxxxx
Thank you all for replies,
while looking for replication information I found this:
http://thebuild.com/presentations/worst-day-fosdem-2014.pdfIt's a real life experience of hitting this replication bug:
https://wiki.postgresql.org/wiki/Nov2013ReplicationIssueThe primary symptom of this corruption is rows that:
- are present on the master, but missing on the replica
- have been deleted on the master still appear to be visible on the replica
- have been updated, and their old versions appear alongside the new, updated versions on the replica
How can I verify whether I already have this corruption?
There is no known way to identify that the issue has affected a standby in the past but comparing the data from the primary with the standby.
So hands up who still think PostgreSQL doesn't need some way of checking the data consistency between master-standby? :)
My hand is up.
From the wiki page that you referenced you forgot to quote this:
“This is an issue, discovered Nov. 18, 2013., which can cause data corruption on a Hot-Standby replica when it is (re-)started, by marking committed transactions as uncommitted. This issue is fixed in the December 5th 2013 update releases.”
So, what’s your point? Yes, any software can (and does) have bugs. What’s important is how quickly the bug is discovered and fixed.
You wish to write a utility that compares data on 2 different clusters – sure, by all means, but I believe time could be better spent on something else.
But, how is to say that this utility will be 100% error-free?
Regards,
Igor Neyman
My point was to show that such a tool would be beneficial even for PostgreSQL. Yes, even the checking utility can have bugs, but at least you are better than now. Because now everybody relies on the fact that it should be ok. But as we see from the bug report, such problems exist here also. So from the situation where you believe everything should be ok you would have a tool reporting "checksums ok" and would allow your to trust your standby data more.