Jigar Shah <jshah@xxxxxxxxxxx> writes:
> We had some disk issues on the primary, but RAID verification corrected
> those blocks. That may have caused the primary to be corrupt.

"corrected" for small values of "corrected", I'm guessing :-(

> I have identified the objects; they are both indexes:
>
>          relname         | relfilenode | relkind
> ------------------------+-------------+---------
>  feedback_packed_pkey   |      114846 | i
>  feedback_packed_id_idx |      115085 | i

Hm, well, the good news is you could reindex both of those (there's a
sketch in the postscript below); the bad news is that there are certainly
more problems than this.

> The secondary is the most recent copy. If we could just tell the
> secondary to go past that corrupt block and get the database started,
> we could then divert traffic to the secondary so our system can run
> read-only until we can isolate and fix our primary. But the secondary
> is stuck at this point and won't start. Is there a way to make the
> secondary do that? Is there a way to remove that block from the WAL
> file it's applying so it can go past that point?

No.  You could probably make use of the PITR functionality to let the
secondary replay up to just short of the WAL record where corruption
becomes apparent, then stop and come up normally (a rough sketch of the
settings is in the postscript below).  The problem here is that it seems
unlikely that the detected-inconsistent WAL record is the first bit of
corruption that's been passed to the secondary.  I don't have a lot of
faith in the idea that your troubles would be over if you could only
fire up the secondary.

It's particularly worrisome that you seem to be looking for ways to
avoid a dump/restore.  That should be your zeroth-order priority at
this point.

What I would do if I were you is to take a filesystem backup of the
secondary's entire current state (WAL and data directory) so that you
can get back to this point if you have to.  Then try the PITR
stop-at-this-point trick.  Be prepared to restore from the filesystem
backup and recover to some other stopping point, possibly a few times,
to get to the latest point that doesn't have clear corruption.

Meanwhile you could be trying to get the master into a better state.
It's not immediately obvious which path is going to lead to a better
outcome faster, but I wouldn't assume the secondary is in better shape
than the primary.

On the master, again it seems like a filesystem dump ought to be the
first priority, mainly so that you still have the data if the disks
continue the downward arc that it sounds like they're on.

In short: you're in for a long day, but your first priority ought to be
to make sure things can't get even worse.

			regards, tom lane
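
PS: once you do have a healthy copy to work with, rebuilding those two
indexes is mechanical.  A minimal sketch, to be run against the repaired
database rather than the stuck standby:

    -- rebuild the two indexes identified above
    REINDEX INDEX feedback_packed_pkey;
    REINDEX INDEX feedback_packed_id_idx;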
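
For the PITR stop-at-this-point trick, the recovery settings would look
something like the following.  The restore_command path and the target
timestamp are placeholders, and you should expect to adjust the target a
few times by trial; on releases that no longer read recovery.conf, the
same settings go in postgresql.conf together with a recovery.signal file.

    # recovery.conf on the standby
    restore_command = 'cp /path/to/wal_archive/%f %p'
    # stop replay just short of the first clearly-bad WAL record;
    # recovery_target_xid is an alternative if you know the transaction
    recovery_target_time = '2012-01-01 12:00:00'
    recovery_target_inclusive = false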
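
And the filesystem-level copies don't need anything fancy, so long as
the postmaster in question is shut down first (the paths here are
examples only):

    # snapshot the standby's entire data directory, WAL included,
    # before experimenting with recovery targets
    rsync -a /var/lib/pgsql/data/ /some/safe/place/standby-data/

    # on the master, as soon as it will run at all, a logical dump is
    # what really gets the data out of harm's way
    pg_dumpall > /some/safe/place/everything.sql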