On 5/25/17 6:30 AM, Tom Lane wrote:
> David Wall <d.wall@xxxxxxxxxxxx> writes:
>> They do have a slave DB running via WAL shipping. Would that likely
>> help us in any way?
> Have you tried taking a backup from the slave? It's possible that
> the corruption exists only on the master.
We will give this a try once the customer lets us. It appears
we'll need to allot considerable downtime to try our options.
Because the warm standby was resynced in October (it had been down because
its OS had remounted the filesystem read-only, for an unknown length of
time, which we only noticed when the primary's disk was filling up with
WALs), we believe we may have 'tar' copied the corrupted data too. But we
will first stop the web apps, then 'tar' back up the database, then stop
recovery on the warm standby, check that its table counts match
production (in case it has any issues of its own), and see if the warm DB
is any better. If so, we'll restore from there. If not, we'll try
zeroing out the bad blocks and see what happens.
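
For the table-count check, something like the following is roughly what I have in mind. It is only a sketch: the function name and the example table names are made up for illustration, and it assumes the per-table row counts have already been collected separately on each server (e.g. with SELECT count(*) per table), which is not shown here.

```python
def compare_table_counts(primary, standby):
    """Compare per-table row counts from two servers.

    Both arguments are dicts mapping table name -> row count.
    Returns a dict {table: (primary_count, standby_count)} containing
    only the tables whose counts differ. A table missing on one side
    is reported with None for that side.
    """
    mismatches = {}
    for table in sorted(set(primary) | set(standby)):
        p = primary.get(table)
        s = standby.get(table)
        if p != s:
            mismatches[table] = (p, s)
    return mismatches


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    prod = {"accounts": 1200, "documents": 58000, "audit_log": 910000}
    warm = {"accounts": 1200, "documents": 57990}
    for table, (p, s) in compare_table_counts(prod, warm).items():
        print(f"{table}: primary={p} standby={s}")
```

An empty result would only tell us the counts agree; it would not prove the standby is free of corruption, so a full dump from the standby is still the real test.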
Fortunately, the corruption appears in only one of their two DBs. Once we
can successfully dump the DBs, we will migrate to the new hardware and OS
and the upgraded PG.
Thanks to you and Karsten Hilbert for your help.
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)