Re: locate DB corruption

On Fri, Aug 31, 2018 at 8:48 PM Dave Peticolas <dave@xxxxxxxxxx> wrote:
On Fri, Aug 31, 2018 at 5:19 PM Adrian Klaver <adrian.klaver@xxxxxxxxxxx> wrote:
On 08/31/2018 08:51 AM, Dave Peticolas wrote:
> On Fri, Aug 31, 2018 at 8:14 AM Adrian Klaver <adrian.klaver@xxxxxxxxxxx> wrote:
>
>     On 08/31/2018 08:02 AM, Dave Peticolas wrote:
>      > Hello, I'm running into the following error running a large query
>      > on a database restored from WAL replay:
>      >
>      > could not access status of transaction 330569126
>      > DETAIL: Could not open file "pg_clog/0C68": No such file or directory
>
>
>     Postgres version?
>
>
> Right! Sorry, that original email didn't have a lot of info. This is
> 9.6.9 restoring a backup from 9.6.8.
>
>     Where is the replay coming from?
>
>
>  From a snapshot and WAL files stored in Amazon S3.

Seems the process is not creating a consistent backup.

This time, yes. This setup has been working for almost two years with probably hundreds of restores in that time. But nothing's perfect I guess :)
 
How are they being generated?

The snapshots are sent to S3 via a tar process after calling the start backup function. I am following the postgres docs here. The WAL files are just copied to S3.
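
For readers following along, the snapshot step described above could be sketched roughly as below. This is not the poster's actual script: the function name, the label, and the archive name are hypothetical, `conn` is assumed to be any DB-API connection to the running 9.6 server (the `%s` placeholder follows psycopg2's style), and the S3 upload is left out entirely.

```python
import tarfile


def base_backup_to_tar(conn, data_dir, tar_path, label="s3-snapshot"):
    """Tar data_dir between pg_start_backup() and pg_stop_backup().

    Hypothetical helper: conn is any DB-API connection to the 9.6 server.
    """
    cur = conn.cursor()
    # 9.6 exclusive backup mode: mark the start of the base backup.
    cur.execute("SELECT pg_start_backup(%s)", (label,))
    try:
        # Files may change while being read; WAL replay is what makes
        # the restored copy consistent again.
        with tarfile.open(tar_path, "w:gz") as tar:
            tar.add(data_dir, arcname="pgdata")
    finally:
        # Always leave backup mode, even if the tar step failed.
        cur.execute("SELECT pg_stop_backup()")
    return tar_path
```

The `finally` clause is the point of the sketch: if the tar step dies halfway, the server should still be taken out of backup mode, and the partial archive should not be used for a restore.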
 

>     Are you sure you are not working across versions?
>
>
> I am sure, they are all 9.6.
>
>     If not do pg_clog/ and 0C68 actually exist?
>
>
> pg_clog definitely exists, but 0C68 does not. I think I have
> subsequently found the precise row in the specific table that seems to
> be the problem. Specifically I can select * from TABLE where id = BADID
> - 1 or id = BADID + 1 and the query returns. I get the error if I select
> the row with the bad ID.
>
> Now what I'm not sure of is how to fix.
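
The row-by-row probing described above (neighbouring ids read fine, the bad id raises the error) can be automated with something like the following. It is only a sketch: `fetch_row` is a hypothetical callable that selects one row by id and raises on failure, and with a real driver it would also need to roll back (or run in autocommit mode) after each error so later probes are not rejected.

```python
def find_unreadable_ids(fetch_row, ids):
    """Return (id, error message) pairs for ids whose row cannot be read.

    fetch_row is a hypothetical callable, e.g. a wrapper around
    "SELECT * FROM some_table WHERE id = ..." that raises on error.
    """
    bad = []
    for row_id in ids:
        try:
            fetch_row(row_id)
        except Exception as exc:  # in practice, the driver's error class
            bad.append((row_id, str(exc)))
    return bad
```

Probing one id at a time is slow on a large table, but it narrows the damage down to specific rows rather than just "the query fails".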

One thing I can think of is to rebuild from a later version of your S3
data and see if it has all the necessary files.

Yes, I think that's a good idea, I'm trying that.
 
There is also pg_resetxlog:

https://www.postgresql.org/docs/9.6/static/app-pgresetxlog.html

I have not used it, so I cannot offer much in the way of tips. Just
from reading the docs I would suggest stopping the server and then
creating a backup of $PGDATA (if possible) before using pg_resetxlog.
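
A minimal sketch of that precaution, assuming a plain file-level copy is acceptable: the function name and paths are hypothetical, and the check relies on the `postmaster.pid` lock file that a running server keeps in its data directory (a stale file left by a crash would also trigger it).

```python
import os
import shutil


def copy_data_dir(data_dir, backup_dir):
    """Copy a stopped cluster's data directory before risky repair work."""
    pid_file = os.path.join(data_dir, "postmaster.pid")
    if os.path.exists(pid_file):
        # A running (or uncleanly stopped) server leaves this file behind.
        raise RuntimeError("postmaster.pid present; stop the server first")
    # backup_dir must not exist yet. Symlinks are copied as symlinks, so
    # tablespaces outside data_dir are NOT included in the copy.
    shutil.copytree(data_dir, backup_dir, symlinks=True)
    return backup_dir
```

With a copy in hand, an unsuccessful pg_resetxlog run can at least be undone by putting the original files back.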

Thanks, I didn't know about that. The primary DB seems OK so hopefully it won't be needed.

Well, restoring from a backup of the primary does seem to have fixed the issue with the corrupt table.
