2014-02-23 7:08 GMT+01:00 Raghavendra <raghavendra.rao@xxxxxxxxxxxxxxxx>:
Hi Mat,
Thank you for the pointers on pause_at_recovery_target and recovery_target_time. It worked but I encountered an unexpected situation.
I wanted to test three recovery times by checking data at each point and then proceed to the next. It worked as expected first 2 recovery times but the last one did not give me an opportunity to check data. It simply completed recovery and switched timeline. This means I cannot rollforward anymore unless I restore the database again. Do you think I did something wrong?
You did right, for the first two time targets you have some wal files pending hence pause is allowed. Whereas for the last time target there are no archive files to pause so it has just opened the database as completion of PITR.I am able to reproduce this on my local as well.This is from my logs:2014-02-06 18:54:54 PST-11634---[] LOG: restored log file "0000000100000001000000B8" from archive2014-02-06 18:54:55 PST-11634---[] LOG: restored log file "0000000100000001000000B9" from archive2014-02-06 18:54:55 PST-11634---[] LOG: consistent recovery state reached at 1/B90029C82014-02-06 18:54:55 PST-11632---[] LOG: database system is ready to accept read only connectionscp: cannot stat `/opt/PostgreSQL/9.3/archives93/0000000100000001000000BA': No such file or directoryB8,B9 applied and looking for next file BA which is not there and no point of pausing without any files, hence it has opened the database. But my guess is, what it has done is right, there no more files to pause and allow you to query, its like completion of PITR.Same in your case, there may not be any files in your archives directory. You can check for '00000004000000080000000D' file in your archive directory.,2014-02-22 22:04:16 CSTLOG: restored log file "00000004000000080000000C" from archive
cp: cannot stat `/pgdata/backups/xlogs/minerva_ecl_test/00000004000000080000000D': No such file or directory
Raghavendra is right, the recovery applied successful your archived wal segment to the database and ended the loop in the xlog.c when it not found anymore records in the wal segments. Moreover the backend is telling you that the recovery phase ended at "2014-02-22 19:48:29.205898-06" while your target timeline was "2014-02-22 19:50:00" and it's interesting.
Did you create transactions and you committed them at that time or after ?
Normally, when you have to do a restore, probably you had a crash and your filesystem is not ok, so you use a new filesystem on another storage. So your last transactions are "lost", because those transactions was not still archived, infact those transactions were in the last wal segment of the crashed filesystem. If that file is not OK, you could not use it for complete the recovery.
If you have a RPO policy , you have to look at parameter archive_timeout , so you are sure your wal segments will be archived, and if you do that and your database does not create so much transactions , think about to compress them or use data deduplication at storage level or filesystem level , or you will have a lot of wasted space ( the wal segment even if is not full will write 16Mb of <data>+<00000..000>)
Have a nice day
Mat Dba