Invalid page header

Hello,

I have a server running PostgreSQL 8.4.3 on which I see intermittent, rather rare "invalid page header" errors. A quick search of the pg lists turns up the general advice to "check your hardware". Yes, I need to schedule downtime and run some checks.

However, let me also share what I noticed; maybe you can comment or suggest something beyond that.

As I said, I had already seen a few cases of invalid page header on that server, but did not pay much attention to them, as they always involved the same table or its indexes. These could easily be dropped and rebuilt, because that table's contents depend on other tables. So I was happy to do just that. There were only a few such cases in the 10 months this server has been running (and that was the actual reason I reported, earlier this year, that autovacuum gets messed up when an invalid page header is left unrepaired for a long time).

This last time, however, the invalid page header hit a different table, one that is the master source for many other tables in my database, so I really had to deal with it. Here is what I noticed about this case:

- this is a constantly growing table collecting raw information. The data in the damaged page was accessed several times in the few hours after its insertion, before yet another access finally ended with an "invalid page header" error.

- exactly one page was damaged, with no other damage around it. The system runs on FreeBSD 7.2, UFS with a 16k block size, on a RAID 10 with a 256 stripe size, if this matters

- when playing with pg_filedump I noticed that the last pages of the table are always initially reported as damaged as they appear; then, as newer pages get allocated and filled, these initially bad pages "become valid", as in the following example, which repeats the same pg_filedump command:

[pgsql@gil ~]$ pg_filedump data/base/18319/36870.43 | grep -B9 -i "invalid header" | grep ^Block
Block 7460 ********************************************************
Block 11457 ********************************************************
Block 11460 ********************************************************
Block 11461 ********************************************************
[pgsql@gil ~]$ pg_filedump data/base/18319/36870.43 | grep -B9 -i "invalid header" | grep ^Block
Block 7460 ********************************************************
Block 11460 ********************************************************
Block 11461 ********************************************************
Block 11462 ********************************************************
[pgsql@gil ~]$ pg_filedump data/base/18319/36870.43 | grep -B9 -i "invalid header" | grep ^Block
Block 7460 ********************************************************
Block 11461 ********************************************************
Block 11462 ********************************************************
Block 11463 ********************************************************

- Block 7460 above is the one that actually got corrupted. Even though I zeroed it with the zero_damaged_pages option, it is still reported as invalid
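
One thing that may be worth ruling out before blaming hardware is whether the pages flagged above are simply all-zero pages (freshly extended and not yet initialized, or zeroed by zero_damaged_pages) rather than pages with garbage in the header. The sketch below is a rough, unofficial check and not a replacement for pg_filedump. It assumes the default 8 kB block size, a little-endian machine, an 8-byte maximum alignment, and the 24-byte page header of layout version 4 with three valid flag bits; all of these constants are assumptions and should be verified against the actual build. The names classify_page and scan_segment are made up for this illustration.

```python
import struct
import sys

# All constants below are assumptions about the build, not verified facts.
BLCKSZ = 8192             # assumed default block size
HEADER_SIZE = 24          # assumed page header size for layout version 4
LAYOUT_VERSION = 4        # assumed page layout version for 8.3/8.4
VALID_FLAG_BITS = 0x0007  # assumed set of valid pd_flags bits
MAXALIGN = 8              # assumed maximum alignment on this platform


def classify_page(page):
    """Return 'short', 'zero', 'ok' or 'bad' for one raw page (bytes)."""
    if len(page) != BLCKSZ:
        return "short"    # trailing partial block
    if not any(page):
        return "zero"     # every byte is 0: an empty page, not garbage
    # Assumed header fields: pd_lsn (8 bytes), pd_tli, pd_flags, pd_lower,
    # pd_upper, pd_special, pd_pagesize_version (2 bytes each).
    _lsn, _tli, flags, lower, upper, special, size_version = \
        struct.unpack_from("<QHHHHHH", page, 0)
    page_size = size_version & 0xFF00
    version = size_version & 0x00FF
    sane = (
        page_size == BLCKSZ
        and version == LAYOUT_VERSION
        and (flags & ~VALID_FLAG_BITS) == 0
        and HEADER_SIZE <= lower <= upper <= special <= BLCKSZ
        and special % MAXALIGN == 0
    )
    return "ok" if sane else "bad"


def scan_segment(path):
    """Yield (block number within the segment file, status) for each block."""
    with open(path, "rb") as f:
        blockno = 0
        while True:
            page = f.read(BLCKSZ)
            if not page:
                break
            yield blockno, classify_page(page)
            blockno += 1


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print only the blocks that are not plain valid pages.
    for blockno, status in scan_segment(sys.argv[1]):
        if status != "ok":
            print(blockno, status)
```

Run against the segment file from the example (data/base/18319/36870.43), this would show whether block 7460 and the blocks at the end of the file come out as "zero" or as "bad". Only the "bad" ones would carry a header that looks like real garbage under the assumptions above.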

Do the above remarks indicate that something other than a hard-to-find hardware issue might be involved, something that could be tracked down in more detail?

Thanks

Irek.


--
Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

