Search Postgresql Archives

Re: multiple tables got corrupted

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



That depends on what the problem is and how they fix it. Most likely yes -- especially since if you haven't enabled data checksums you won't *know* if things are OK or not. So I'd definitely recommend it even if things *look* OK.

//Magnus


On Wed, Sep 16, 2020 at 5:06 AM Vasu Madhineni <vasumdba1515@xxxxxxxxx> wrote:
I could see block read I/O errors in /var/log/syslog. if those error fixed by OS team, will it require recovery.

Also can i use LIMIT and OFFSET to locate corrupted rows?

Thanks in advance

Regards,
Vasu Madhineni

On Wed, Sep 16, 2020, 01:58 Magnus Hagander <magnus@xxxxxxxxxxxx> wrote:
Try reading them "row by row" until it breaks. That is, SELECT * FROM ... LIMIT 1, then LIMIT 2 etc. For more efficiency use a binary search starting at what seems like a reasonable place looking at the size of the table vs the first failed block to make it faster, but the principle is the same. Once it fails, you've found a corrupt block...

//Magnus


On Tue, Sep 15, 2020 at 12:46 PM Vasu Madhineni <vasumdba1515@xxxxxxxxx> wrote:
Is it possible to identify which rows are corrupted in particular tables.

On Tue, Sep 15, 2020 at 5:36 PM Magnus Hagander <magnus@xxxxxxxxxxxx> wrote:


On Tue, Sep 15, 2020 at 11:15 AM Vasu Madhineni <vasumdba1515@xxxxxxxxx> wrote:
Hi All,

In one of my postgres databases multiple tables got corrupted and followed the below steps but still the same error.

1.SET zero_damaged_pages = on
2. VACUUM ANALYZE, VACUUM FULL
but still same error. 


That is a very destructive first attempt. I hope you took a full disk-level backup of the database before you did that, as it can ruin your chances for forensics and data recovery for other issues.


moh_fa=# VACUUM FULL;
ERROR:  could not read block 9350 in file "base/1156523/1270812": Input/output error

Tried to take backup of tables with pg_dump but same error. files exist physically in base location.

How to proceed on this, no backup to restore.


This is clearly some sort of disk error, and with no backups to restore you will definitely be losing data.

I'd start by figuring out which tables have no corruption and do work, and back those up (with pg_dump for example) as soon as possible to a different machine -- since it's not exactly unlikely that further disk errors will appear.

Once you've done that, identify the tables, and then try to do partial recovery. For example, if you look at the file 1270812, how big it is? PostgreSQL is failing to read block 9350 which is 76595200 bytes into the file. If this is at the very end of the file, you can for example try to get the data out until that point with LIMIT. If it's in the middle of the file, it gets more ticky, but similar approaches can be done. 

Also, unless you are running with data checksums enabled, I wouldn't fully trust the data in the tables that you *can* read either. Since you clearly have disk issues, they may have caused corruption elsewhere as well, so whatever verification you can do against other tables, you should do as well.


You'll of course also want to check any kernel logs or storage system logs to see if they can give you a hint as to what happened, but they are unlikely to actually give you something that will help you fix the problem.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux