On 08/08/2012 09:39 AM, Stephen Frost
wrote:
I'd really like a "VERIFY" command for PostgreSQL, though a proper one isn't really possible without block checksums.Terry, * Terry Schmitt (tschmitt@xxxxxxxxxxxxxxxx) wrote:So far, executing pg_dumpall seems to be fairly reliable for finding the corrupt objects after my initial data load, but unfortunately much of the corruption has been with indexes which pgdump will not expose.Shouldn't be too hard to write a script that'll do a query against each table using an ORDER BY that matches each index, at least for 'simple' indexes, which'll typically cause an in-order index traversal. I'm currently working on a virtual plug pull tool that uses VMs to simulate abrupt crashes of the machine PostgreSQL is running on. One of the bigger challenges is that Pg doesn't offer any reliable way to detect even simple corruption. Maybe a pg_read_relation(oid) that simply reads all blocks in an index or table would help. It could live in the `adminpack' module ( http://www.postgresql.org/docs/9.1/static/adminpack.html) or `pageinspect' module ( http://www.postgresql.org/docs/9.1/static/pageinspect.html). It turns out I can use the pageinspect functions to do a rough kind of verify, but it's pretty slow and inconvenient. Eg: WITH pages(page) AS ( takes 90ms when a 'SELECT * FROM tablename' takes 6.2ms . On a bigger table, the query takes 3939.912 vs 125.135ms for a table scan. Of course, pageinspect is mostly interesting for indexes, where I'd do: create or replace function scan_index(indexname text) returns setof record as $$ ... but that's getting really quite slow and still hasn't touched the free space map or visibility map. Of course, these checks prove nothing about subtle corruption or incorrect contents, they only make sure Pg can read them and they look vaguely sane. It doesn't do any kind of consistency checking between index and table. -- Craig Ringer |