Autovacuum and invalid page header

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I had a problem with neverending forced autovacuum process, running as preventing xid wraparound. As I finally (?) found, following some advices given here: -> http://forums.enterprisedb.com/posts/list/2028.page, that autovacuum in question was not just one autovacuum, but many different autovacuums on the same table cycling in start-fail scenario because of invalid page headers of some indexes of the table. Manual VACUUM VERBOSE said me that, now I can also see that in related ERROR-CONTEXT log message pairs. I dropped the damaged indexes and now that autovacuum seems to continue without repeating itself.

Anyway, another questions come from that situation and I'd like to discuss them here and get some advice.

First, it seems that an unnoticed damage of a relation, causing autovacuum failures, which might initially be harmless for the whole cluster (as far as only damaged relation and its disk space is concerned), may lead to a critical situation when a wraparound gets close, and autovacuum is in prevenitng mode can't freeze its xids. The forced autovacuum insists on vacuuming the table which can not be vacuumed and locks itself there. At the same time, autovacuum get sticked with a database of the damaged relation, and "forgets" about other databases which might need me vacuumed. From this point of view, just one damaged table might be a single point of failure of the whole cluster. Is there any way, other than tracing logs, to prevent such a situation?

Another issue, rather loosely related to the topic. I suspect that the cluster may have more invalid page headers like that. They might be caused by past bad sector failures of one of the drives of my raid array. So I think it might be a good idea to check the whole cluster, page by page, for invalid page headers. Is there any ready tool, which, for instance when given a path to database cluster, would traverse all cluster directories and files and check all page headers? I probably answered myself - manual VACUUM [VERBOSE] would do - but it fails when finds the first invalid header, so I would have. That's why I am looking for something only scanning and reporting invalid headers.

Yet another issue: how could that happen, the bad sectors of a RAID10 member drive caused page damages? (as far as I am right in this assumption, of course, but I have a reason to think so). Should not the RAID array prevent such a file damage?

Thanks

Irek.

--
Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux