Merlin Moncure wrote: >> postgresql 8.1 supports pitr archiving. you can >> do continuous backups and restore the database to just before the bad >> data. I tried using point-in-time-recovery to restore the state of the database immediately before the corruption. It didn't work, but it was quite a show. Here's the story. After much wailing and gnashing of teeth, I got postmaster to execute a recovery (so that recovery.conf was renamed to recovery.done). But the database was completely screwed after the recovery. Here's an example of the kind of output I saw while executing a simple SELECT statement: postgres=# SELECT entry_date,machine_id,coin FROM collections WHERE entry_date::date>'2009-06-06' ORDER BY entry_date; WARNING: could not write block 32 of 1663/10793/2608 DETAIL: Multiple failures --- write error may be permanent. ERROR: xlog flush request 0/4DC6CC88 is not satisfied --- flushed only to 0/4DC06180 CONTEXT: writing block 32 of relation 1663/10793/2608 Here's the recovery procedure I followed: 0. Shortly after the corruption on June 9, 2009, I shut down the server and backed up the entire data directory. The recovery procedure described herein begins with this file system backup. 1. The most recent non-corrupted snapshot of the database is a pg_dump from May 13, 2009. (I don't have any file system backups from before the corruption.) I restored the database to this snapshot by executing the commands from the May 13 pg_dump on the June 9 corrupted data. 2. I removed the files in the pg_xlog directory and replaced them with the contents of pg_xlog from the corrupted file system backup from June 9. 3. I modified the sample recovery.conf file so as to replay all the transactions right up until the point of corruption. The hope was that postmaster would somehow know to begin replaying transactions at the appropriate point from the May 13 state. I guess it's too much to ask postmaster to do a PITR from a pg_dump backup, as opposed to a file system backup. Bummer. By the way, I can reliably get postmaster to hang during startup if I manually create the pg_xlog\RECOVERYHISTORY and pg_xlog\RECOVERYXLOG directories (even with correct permissions) before starting up the server. When I say "hang", I mean that (i) any attempt to connect is met with the response "FATAL: the database system is starting up", and (ii) "pg_ctl stop" cannot be used to shut down the server -- I have to use "pg_ctl kill" Anyway, I'm going to try implementing Tom's suggestion of writing a program to modify the xmin/xmax values. I expect this approach won't work, as autovacuum was on at the time of corruption. However, the files in the data directory are quite large -- many times larger than a pg_dump. The database sees such a small amount of traffic that it's possible that even vacuum decided not to bother reclaiming the unused storage created by the corrupting transaction (?). Here's hoping. -Gus On Thu, Jun 11, 2009 at 1:43 PM, Gus Gutoski wrote: > Thanks for the replies. > > Tom Lane wrote: >> This being 8.1, if you haven't turned on autovacuum there is some chance >> of that. > > Unfortunately, autovacuum was on. I don't recall ever turning it on, > but this database is over two years old; it's possible that I blindly > followed advice from pgAdmin or something way back when. > > Merlin Moncure wrote: >> does postgresql 8.1 have last_vacuum time in the pg_stat_all_tables >> coiumn? > > I did not see a column called last_vacuum in the pg_stat_all_tables table. > >> postgresql 8.1 supports pitr archiving. you can >> do continuous backups and restore the database to just before the bad >> data. > > I'm learning about this now. I'm cautiously optimistic, as my pg_xlog > directory contains some files whose timestamp is near the time of the > 'incident'. > > By "backup" do you mean the contents of a pg_dump? The most recent > dump was two months ago. I'm worried that my log files might not go > far enough back in time to restore the table from the most recent > dump. > > Both Tom's and Merlin's suggestions carry a significant learning > curve. I'll do what I can in the coming days and post to the list if > anything noteworthy happens. > >> plus, there is no way you are escaping the obligatory 'where are your >> backups?'. :-). > > It's a classic story. I'm volunteering about one day per month for > this project, learning SQL as I go. Priority was always given to the > "get it working" tasks and never the "make it safe" tasks. I had/have > grandiose plans to rewrite the whole system properly after I graduate. > Unfortunately, the inevitable corruption didn't wait that long. > > Cheers. > > -Gus > -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general