Re: index and table corruption

Shaun Thomas <sthomas@xxxxxxxxxxxxxxxx> · Thu, 19 Dec 2013 13:23:10 -0600

On 12/19/2013 12:42 PM, Anand Kumar, Karthik wrote:

> ERROR:  index "mv_visits_p03_n2" contains unexpected zero page at block
> 15939
> ERROR:  invalid page header in block 344713 of relation
> pg_tblspc/4376157/PG_9.1_201105231/16393/8367465

I don't care what kind of checks your admins have performed. You have
either bad memory, a bad controller card, SAN, or an otherwise unstable
system. Do not continue to use this platform in its current state if you
care about your data.

> A majority of the tables are the same each time, although new ones will
> come in, and old ones will go out. A total of about 84 out of 452 tables
> have gotten this error so far.

This is screaming memory or disk-based corruption.

> We run postgres verion 9.1.2, installed via the PGDG rpms.
> The server runs centos5.6, and the disk backend is Netapp based SAN
> Its a 24CPU box, with 768G RAM.
> The database is about 1TB. Its a single database cluster.

That's a pretty nice system. :)

> - We set zero_damaged_pages = on, ran a full vacuum and re-index of 4
> tables. Both the full vacuum and reindex completed successfully, with no
> errors. The same tables showed up when it failed again.

Because they're being corrupted again.

> - We've had the sysadmins check for errors with the hardware  no errors
> so far about any h/w problems, either on the box, with the SAN switches,
> or on the filer. We are going to switch over to a different server on the
> same SAN backend, to see if that helps

Do this. Do nothing else but this. Regardless of the checks the admins
have run, you need to verify the data remains uncorrupted by removing
variables. If this doesn't help, your SAN itself may be the problem.

> - We suspected it might have something to do with
> http://wiki.postgresql.org/wiki/20120924updaterelease, and upgraded to
> postgres 9.1.11, that hasn't helped.

You should do that anyway. There is literally no benefit to running an
old minor release. Unlike 9.1 to 9.2, 9.1.2 to 9.1.11 costs nothing but
a database restart. The amount of potential data corruption, crash, and
planner bugs you avoid by doing so should never be underestimated.

> - We had shared_buffers set to 60G, and reduced that down to 8G, and then
> to 4G, suspecting problems with the background writer handling such high
> shared buffers, that hasn't helped either.

60GB is way too high. Use 4GB or 8GB, like you said. 60GB means a very
very long checkpoint, requires a far longer completion_timeout and
checkpoint segments to really be useful, and greatly amplifies recovery
time.

> log_min_duration_statement   | 0

You're logging every statement to pass through the server? If you have a
lot of volume, that's a ton of IO all by itself. Enough that I would
recommend putting the logs on another LUN entirely.

> maintenance_work_mem         | 32MB

This is way too low. This setting is used for doing vacuum operations
and other work necessary to maintain the database and its contents. You
have more than enough memory to set this at 1GB.

> synchronous_commit           | off

Turn this back on. So long as you're having corruption issues, you need
this to be as stable as possible. Having transactions report commit
before the WAL is successfully written means potentially losing
transactions, especially with the frequent shutdowns these corruptions
are causing.

Aside from that, you're not likely to find much from asking here. You
clearly have a hardware problem somewhere along the chain. Until that's
resolved, you will have random corrupt data on your most frequently
modified objects. Snapshot restores and WAL recovery can help repair the
issues, but it's only a matter of time before a corruption is written
right into the WAL itself, forcing you to do PITR instead.

Good luck!

-- 
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-676-8870
sthomas@xxxxxxxxxxxxxxxx

______________________________________________

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general