Re: PG engine takeover or switch over

Chander Ganesan <chander@xxxxxxxxxx> · Thu, 16 Aug 2007 09:49:19 -0400

Hannes Dorbath wrote:
On 15.08.2007 21:30, Medi Montaseri wrote:
I am looking for some suggestions to optimize the following
problem/solution.

Given two nodes A and B (two computers) in a active-passive mode 
where A is
running PG engine,  and B is simply standing by,  and  a common storage
(twin tailed) ( or from pg_ctl point of view -D /common/data ), I am 
looking
for a faster solution during the takeover where A has crashed and B 
is to
start PG engine and run with it.

My current solution is to start PG engine which should take little 
time to
study the configuration files and /common/data and fork a few 
childrens. But
I am still interested in optimizing this start-up cost.

For example, would I gain anything by starting PG engine on both A 
and B,
but on B I point it to /common/dummy and during the takeover, I 
somehow tell
it to now read from /common/data, for example have two 
postmaster.conf or
PGDATA and then use  pg_ctl reload.

Starting up PostgreSQL should be very fast, given no recovery to be 
done and decent hardware.

PostgreSQL does not fork a lot unless it is accepting new connections 
and if reading a config file is slow on your system, something else is 
broken.

In a active/passive setup your should be able to switch over in under 
3 seconds. If there was a lot of load on the failed node the recovery 
times on the new active node increase. The only thing you can do about 
that is getting faster disks..

Your startup time in the event of a failure will be predicated on the 
number of WAL files that need to be played back in order to perform 
auto-recovery.  For example, if you've set your checkpoint_segments to 
some high number, PostgreSQL will need to play back those WAL files to 
ensure that transactions that were committed to disk are correctly 
represented in your "on disk" object data (tables, indexes, etc).

Note that in the case of a "graceful" shutdown, you'll find that 
PostgreSQL doesn't need to replay WAL files (since it checkpoints prior 
to shutting down) and as such the startup time is pretty fast.

You can decrease the amount of time it takes to recover by decreasing 
the number of segments per checkpoint; however, this may result in more 
frequent checkpointing and as as result a reduction in overall 
performance on your active node.

Basically, you're balancing the "cost" of auto crash recovery with the 
frequency of checkpointing.

An alternative (and perhaps faster) method of failover would be to have 
a cluster in "warm standby" mode (this wouldn't rely on a shared pgdata 
directory).  In such cases you would only have to wait for the last few 
WAL files to be played back in order to recover.  In the case of a high 
checkpoint_segments number this would allow you to recover quickly from 
failures, but would introduce PITR overhead (copying WAL files when WAL 
files become full), but that's a pretty small cost - and you may already 
have that cost if you do PITR backups. 

There is no way to have postgresql "switch" data directories to speed up 
startup.

--
Chander Ganesan
Open Technology Group, Inc.
One Copley Parkway, Suite 210
Morrisville, NC  27560
919-463-0999/866-229-3386
http://www.otg-nc.com

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings