Re: PITR - warm standby switchover question

Erik Jones <ejones@xxxxxxxxxxxxxx> · Tue, 14 Apr 2009 23:56:43 -0700

Well, if you don't delete the recovery.conf and you *do* delete  
pg_standby's stop file (or it gets deleted, for example if you set it  
to go under /tmp and the server is restarted for whatever reason) the  
server will attempt to go back into recovery mode using your  
configured recovery command in the recovery.conf.  I'm not sure  
exactly what would happen if the stop file is still there but I would  
expect pg_standby to kick in due to the recovery.conf, see the stop  
file, and let the server finish starting up.  However, I haven't  
really tested that before and I certainly wouldn't call it "clean".   
The best option would probably be to write a small script to drop the  
stop file, wait until it can get a valid connection to the db, the rm  
the stop file and mv the recovery.conf to something like  
recovery.conf.inactive or something along those lines.

On Apr 14, 2009, at 5:23 PM, Dan Hayes wrote:

Excellent!  Thanks.  One other quick question...  What would happen  
if I didn't delete the recovery.conf file?  Is that step just to  
prevent accidentally restarting the server with it there?

On Tue, Apr 14, 2009 at 6:26 PM, Erik Jones <ejones@xxxxxxxxxxxxxx>  
wrote:

On Apr 14, 2009, at 3:47 PM, Dan Hayes wrote:

I've followed the implementation instructions at 24.4.2:
http://www.postgresql.org/docs/current/static/warm-standby.html

And I've used the archive/restore commands from the example in F23.2:
http://www.postgresql.org/docs/current/static/pgstandby.html

This all works great.  The primary backs up the WAL archives to a  
mutually accessible folder and the backup system is continuously  
reading in the logs.  Monitoring the "standby.log" file shows that  
the system is working as it should.

My question is, if the main system goes down, what do I do to the  
backup machine to make it come alive?  Currently, when I attempt to  
login to the database, I get the error: "psql: FATAL:  the database  
is starting up"  I'm assuming this means it can't be accessed while  
its in continuous recovery mode.  I attempted to turn off postgresql  
on the main server and try connecting again, same error (was  
obvious, but I do things one step at a time...)  So then I removed  
the "recovery.conf" file on the backup server and attempted to  
restart postgresql.  It stopped fine, but the restart failed.

The pg_log/postgresql-XXXX-XX-XX.log file says:
LOG:  database system was interrupted while in recovery at log time  
2009-04-14 17:36:14 CDT
HINT:  If this has occurred more than once some data might be  
corrupted and you might need to choose an earlier recovery target.
LOG:  could not open file "pg_xlog/0000000100000002000000DD" (log  
file 2, segment 221): No such file or directory
LOG:  invalid primary checkpoint record
LOG:  could not open file "pg_xlog/0000000100000002000000DC" (log  
file 2, segment 220): No such file or directory
LOG:  invalid secondary checkpoint record
PANIC:  could not locate a valid checkpoint record
LOG:  startup process (PID 3756) was terminated by signal 6: Aborted
LOG:  aborting startup due to startup process failure

Any ideas what I'm doing wrong?  How do I bring the standby server  
out of standby do act as the primary?

You should use pg_standby's -t flag to specify a stop file.  Then  
all you need to do is touch (create) that file and pg_standby will  
let the server come out of recovery mode into normal operation  
mode.  Be sure to rm or mv the recovery.conf once that is complete.

Erik Jones, Database Administrator
Engine Yard
Support, Scalability, Reliability
866.518.9273 x 260
Location: US/Pacific
IRC: mage2k

Erik Jones, Database Administrator
Engine Yard
Support, Scalability, Reliability
866.518.9273 x 260
Location: US/Pacific
IRC: mage2k

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general