Re: Restore and Recover Database

Tom Arthurs <tarthurs@xxxxxxxxxxxx> · Tue, 25 Jul 2006 18:05:41 -0700

Did you have your archive_command configured before you started this 
test (before starting the db?)...

also did the tx_logs actually get saved?  It looks to me that you don't 
have any valid archives.  Also somewhat suspicious that it's starting 
with serial 1 for the transaction log -- which would seem to indicate 
that you have not been archiving logs in the past on this DB.

In fact now that I re-read your messages, it seems to me that the achive 
log feature is not working for you -- you copied all the files out of 
the tx_log directory to your backup directory, which probably put the 
log files there.  I think you need to be more carefull with this -- look 
at what log files are in your backup directory, and only copy those 
whose serial number is greater than the newest archive -- should be at 
most one file.

what you should do:

1.  change postgresql.conf to as you did, to turn on log file archiving.
2.  restart postgresql
3.  create some transactions, and see if tx log files are being save -- 
and note that they are 16MB each, so it can take a lot of transactions 
to trigger an archive.
4.  create backup
5.  create test table, delete, shutdown db.
6.  restore backup, or point postgresql to the backup data directory, 
create recovery.conf as stated.
7.  start postgesql on second data directory, observe logs -- you should 
see each one replaying untill all transactions are replayed, then the db 
will finish starting up.

Another potential problem I see with your procedure is using tar -- 
which may fail when db files change while it's running.  It will work on 
a quiet db with no changes taking place (maybe) but tar tends to fail 
when files are changed and deleted.  rsync is a better choice for backing.

Also you should use your back to create a new data directory -- either 
totally delete the old one and un-tar or -- better -- create a new data 
directory and untar into that.  (don't do this on your production server 
until you have the procedure down cold).

Yes, I've tested this -- in fact we failed over our production db to our 
standby db and back twice in the past few weeks due to some disk array 
failures (had to replace more than one disk), and we lost no 
transactions or data.

Alexander Burbello wrote:
Sorry insist in this question, but did someone try to restore and 
recover the database, and check if no data is lost??

I tryed to do some steps following the Postgres documentation, but ... 
I couldn't recover.
Anybody has some tips or suggestion?

Thanks in advance.

I followed the steps based on the site, but I couldn't finish 
succesfully.

I did:

1. Put the database on Backup Mode and copy datafiles.
   /pg/bin/psql dbdev -c "SELECT pg_start_backup('/pg/backup/');"
   tar -cvf /pg/backup/bk_base.tar /pg/data/base/*
   /pg/bin/psql dbdev -c "SELECT pg_stop_backup();"

   File .conf: archive_command = 'cp -i %p /pg/backup/xlog/%f </dev/null'

2. Created a new table and populated with data, to simulate the recovery:
   create table test (
    aa integer,
    bb varchar(50)
   );

   insert into test values (1,'aaa');
   ...
   insert into test values (5,'aaa');

   Data inserted successfully!!!

3. Shutdown on database;

   Last log transactions copied to the directory archived log;
   cp /pg/data/pg_xlog/* /pg/backup/xlog/

4. Configuring the recovery.conf file:

   restore_command = 'cp /pg/backup/xlog/%f %p'
   recovery_target_time = '2006-07-06 16:33:52 BRT'

5. Simulate the lost directories, deleting... :

   rm -r /pg/data/base/*

6. Recreating the directories exploding the tar file:

   tar -xvf bkp_base... .tar

7. Starting the database for applying the log transactions.
   Supposing recove the table "test" located on log transactions.

   LOG:  database system was shut down at 2006-07-06 16:47:18 BRT
   LOG:  starting archive recovery
   LOG:  restore_command = "cp /pg/backup/xlog/%f %p"
   LOG:  recovery_target_time = 2006-07-06 16:33:52-03
   cp: cannot stat `/pg/backup/xlog/00000001.history': No such file or 
directory
   LOG:  restored log file "000000010000000000000001" from archive
   LOG:  record with zero length at 0/1122880
   LOG:  invalid primary checkpoint record
   LOG:  restored log file "000000010000000000000001" from archive
   LOG:  record with zero length at 0/1122844
   LOG:  invalid secondary checkpoint record
   PANIC:  could not locate a valid checkpoint record
   LOG:  startup process (PID 3989) was terminated by signal 6
   LOG:  aborting startup due to startup process failure

8. There was an error and the table was lost!!!!!!!!!!!!!!!!!

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
      choose an index scan if your joining column's datatypes do not
      match