Did you have your archive_command configured before you started this
test (before starting the db?)...
also did the tx_logs actually get saved? It looks to me that you don't
have any valid archives. Also somewhat suspicious that it's starting
with serial 1 for the transaction log -- which would seem to indicate
that you have not been archiving logs in the past on this DB.
In fact now that I re-read your messages, it seems to me that the achive
log feature is not working for you -- you copied all the files out of
the tx_log directory to your backup directory, which probably put the
log files there. I think you need to be more carefull with this -- look
at what log files are in your backup directory, and only copy those
whose serial number is greater than the newest archive -- should be at
most one file.
what you should do:
1. change postgresql.conf to as you did, to turn on log file archiving.
2. restart postgresql
3. create some transactions, and see if tx log files are being save --
and note that they are 16MB each, so it can take a lot of transactions
to trigger an archive.
4. create backup
5. create test table, delete, shutdown db.
6. restore backup, or point postgresql to the backup data directory,
create recovery.conf as stated.
7. start postgesql on second data directory, observe logs -- you should
see each one replaying untill all transactions are replayed, then the db
will finish starting up.
Another potential problem I see with your procedure is using tar --
which may fail when db files change while it's running. It will work on
a quiet db with no changes taking place (maybe) but tar tends to fail
when files are changed and deleted. rsync is a better choice for backing.
Also you should use your back to create a new data directory -- either
totally delete the old one and un-tar or -- better -- create a new data
directory and untar into that. (don't do this on your production server
until you have the procedure down cold).
Yes, I've tested this -- in fact we failed over our production db to our
standby db and back twice in the past few weeks due to some disk array
failures (had to replace more than one disk), and we lost no
transactions or data.
Alexander Burbello wrote:
Sorry insist in this question, but did someone try to restore and
recover the database, and check if no data is lost??
I tryed to do some steps following the Postgres documentation, but ...
I couldn't recover.
Anybody has some tips or suggestion?
Thanks in advance.
I followed the steps based on the site, but I couldn't finish
succesfully.
I did:
1. Put the database on Backup Mode and copy datafiles.
/pg/bin/psql dbdev -c "SELECT pg_start_backup('/pg/backup/');"
tar -cvf /pg/backup/bk_base.tar /pg/data/base/*
/pg/bin/psql dbdev -c "SELECT pg_stop_backup();"
File .conf: archive_command = 'cp -i %p /pg/backup/xlog/%f </dev/null'
2. Created a new table and populated with data, to simulate the recovery:
create table test (
aa integer,
bb varchar(50)
);
insert into test values (1,'aaa');
...
insert into test values (5,'aaa');
Data inserted successfully!!!
3. Shutdown on database;
Last log transactions copied to the directory archived log;
cp /pg/data/pg_xlog/* /pg/backup/xlog/
4. Configuring the recovery.conf file:
restore_command = 'cp /pg/backup/xlog/%f %p'
recovery_target_time = '2006-07-06 16:33:52 BRT'
5. Simulate the lost directories, deleting... :
rm -r /pg/data/base/*
6. Recreating the directories exploding the tar file:
tar -xvf bkp_base... .tar
7. Starting the database for applying the log transactions.
Supposing recove the table "test" located on log transactions.
LOG: database system was shut down at 2006-07-06 16:47:18 BRT
LOG: starting archive recovery
LOG: restore_command = "cp /pg/backup/xlog/%f %p"
LOG: recovery_target_time = 2006-07-06 16:33:52-03
cp: cannot stat `/pg/backup/xlog/00000001.history': No such file or
directory
LOG: restored log file "000000010000000000000001" from archive
LOG: record with zero length at 0/1122880
LOG: invalid primary checkpoint record
LOG: restored log file "000000010000000000000001" from archive
LOG: record with zero length at 0/1122844
LOG: invalid secondary checkpoint record
PANIC: could not locate a valid checkpoint record
LOG: startup process (PID 3989) was terminated by signal 6
LOG: aborting startup due to startup process failure
8. There was an error and the table was lost!!!!!!!!!!!!!!!!!
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match