
Re: invalid page header


 



Just a little followup on this problem.

We've moved the database to another server where it ran without problems.

HP just released new RAID controller drivers for SUSE and a firmware update for the controller itself.

So far the problem has not recurred.

Thanks!
Jo.

Chris Travers wrote:
Jo De Haes wrote:

OK. The saga continues, everything is a little bit more clear, but at the same time a lot more confusing.

Today I wanted to reproduce the problem again. And guess what? A vacuum of the database went through without any problems.

I dumped the block I was having problems with yesterday. It no longer reports an invalid header, and it contains other data!

Inconsistent problems, especially with PostgreSQL, are usually the result of hardware failure.
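
For anyone wanting to compare the same block across days, as described above, here is a minimal Python sketch. It assumes the default 8192-byte block size and that you know which file under the data folder holds the relation; the names `read_block` and `block_fingerprint` are made up for illustration and are not part of any PostgreSQL tool. Reads go through the operating system, so a cached copy may be returned rather than what is on disk.

```python
import hashlib

BLOCK_SIZE = 8192  # assumption: PostgreSQL's default block size (BLCKSZ)


def read_block(path, block_no, block_size=BLOCK_SIZE):
    """Return the raw bytes of block `block_no` from a relation file."""
    with open(path, "rb") as f:
        f.seek(block_no * block_size)
        data = f.read(block_size)
    if len(data) != block_size:
        # Short read: the block is past the end of the file or truncated.
        raise ValueError(f"block {block_no} is incomplete or missing")
    return data


def block_fingerprint(path, block_no, block_size=BLOCK_SIZE):
    """Return a SHA-256 hex digest of one block, for day-to-day comparison."""
    return hashlib.sha256(read_block(path, block_no, block_size)).hexdigest()
```

If the fingerprint of an untouched block changes between two runs with no writes in between, the storage path rather than the database is the likely suspect.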

Turns out the data that was returned yesterday belongs to another database!

Some more detail about the setup. This server runs 2 instances of PostgreSQL: one production instance, which is version 8.0.3, and another testing instance, installed in a different folder, which runs version 8.1.3. Am I wrong thinking this setup ought to work?


No. I have done it before too. PostgreSQL instances running on different ports or addresses are sufficiently isolated to prevent this from being a problem.


Both instances use completely separate data folders.

So the first dump returned data that actually belongs to an 8.0.3 database (that runs fine). And today without _any_ intervention that same block returns the correct data and the complete database is fine.

Where is the problem?
    The fact that I'm running 2 different instances?
    Cache on raid controller messing up?
    Some strange voodoo?


I would see what sort of memory testing suite you can run on your system first (memtest86, for example) and go from there. It sounds to me like some sort of a hardware issue. It *could* be bits flipped anywhere, from the write head on the disk to the main system memory or the CPU.

The likelihood that it is a random RAM error is reduced if you are using ECC RAM. Otherwise it could be anything.

This being said, when I have seen bits flipped by the CPU, you usually get a lot of index issues and shared memory corruption, so I would be more inclined to think that this was RAM or RAID cache.
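
One rough way to tell a few flipped bits from a block that was swapped wholesale is to compare two saved dumps of the same block. The Python sketch below is only an illustration; `count_differences` is a hypothetical helper, and it assumes both dumps are available as byte strings of equal length.

```python
def count_differences(a: bytes, b: bytes):
    """Return (differing_bytes, differing_bits) between two equal-length dumps."""
    if len(a) != len(b):
        raise ValueError("dumps must be the same length")
    diff_bytes = 0
    diff_bits = 0
    for x, y in zip(a, b):
        d = x ^ y  # set bits mark the positions where the two bytes disagree
        if d:
            diff_bytes += 1
            diff_bits += bin(d).count("1")
    return diff_bytes, diff_bits
```

A handful of differing bits would point toward flipped bits in RAM or on the wire, while thousands of differing bytes would fit a block that came from somewhere else entirely, such as stale controller cache.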

Best Wishes,
Chris Travers
Metatron Technology Consulting


