Cheers
Medi
On 8/16/07, Chander Ganesan <chander@xxxxxxxxxx> wrote:
Hannes Dorbath wrote:
> On 15.08.2007 21:30, Medi Montaseri wrote:
>> I am looking for some suggestions to optimize the following
>> problem/solution.
>>
>> Given two nodes A and B (two computers) in an active-passive mode,
>> where A is running the PG engine, B is simply standing by, and there
>> is a common (twin-tailed) storage (from pg_ctl's point of view,
>> -D /common/data), I am looking for a faster solution during the
>> takeover, where A has crashed and B is to start the PG engine and run
>> with it.
>>
>> My current solution is to start the PG engine, which should take
>> little time to read the configuration files and /common/data and fork
>> a few children. But I am still interested in optimizing this start-up
>> cost.
>>
>> For example, would I gain anything by starting the PG engine on both
>> A and B, but on B pointing it at /common/dummy, and during the
>> takeover somehow telling it to now read from /common/data? For
>> example, having two postgresql.conf files or PGDATA directories and
>> then using pg_ctl reload.
>
> Starting up PostgreSQL should be very fast, given no recovery to be
> done and decent hardware.
>
> PostgreSQL does not fork much unless it is accepting new connections,
> and if reading a config file is slow on your system, something else is
> broken.
>
> In an active/passive setup you should be able to switch over in under
> 3 seconds. If there was a lot of load on the failed node, the recovery
> time on the new active node increases. The only thing you can do about
> that is get faster disks.
>
Your startup time in the event of a failure will be predicated on the
number of WAL files that need to be played back in order to perform
auto-recovery. For example, if you've set your checkpoint_segments to
some high number, PostgreSQL will need to play back those WAL files to
ensure that transactions that were committed to disk are correctly
represented in your "on disk" object data (tables, indexes, etc).
Note that in the case of a "graceful" shutdown, you'll find that
PostgreSQL doesn't need to replay WAL files (since it checkpoints prior
to shutting down) and as such the startup time is pretty fast.
You can decrease the amount of time it takes to recover by lowering
checkpoint_segments; however, this may result in more frequent
checkpointing and, as a result, a reduction in overall performance on
your active node.
Basically, you're balancing the "cost" of auto crash recovery with the
frequency of checkpointing.
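As a rough illustration of where that knob lives, here is a
postgresql.conf fragment (the values shown are the 8.x defaults and are
illustrative, not recommendations):

```
# postgresql.conf -- checkpoint tuning (illustrative values)
checkpoint_segments = 3    # fewer WAL segments between checkpoints =>
                           # less WAL to replay after a crash, but
                           # checkpoints happen more often
checkpoint_timeout = 300   # also force a checkpoint every N seconds
checkpoint_warning = 30    # log a warning if checkpoints come closer
                           # together than this (a sign segments is low)
```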
An alternative (and perhaps faster) method of failover would be to have
a cluster in "warm standby" mode (this wouldn't rely on a shared pgdata
directory). In that case you would only have to wait for the last few
WAL files to be replayed in order to recover. With a high
checkpoint_segments setting this would let you recover quickly from
failures, though it introduces WAL-archiving overhead (copying each WAL
file as it fills); that's a pretty small cost, and one you may already
be paying if you do PITR backups.
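A minimal sketch of that warm-standby wiring (the archive path is a
placeholder, and pg_standby is one commonly used restore_command helper
from contrib; a shell loop that waits for each file works too):

```
# On the primary, in postgresql.conf: copy each completed WAL segment
# to a location the standby can read (path is hypothetical).
archive_command = 'cp %p /shared/walarchive/%f'

# On the standby, in recovery.conf: keep replaying archived segments as
# they arrive, so only the last few need replay at failover time.
restore_command = 'pg_standby /shared/walarchive %f %p'
```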
There is no way to have PostgreSQL "switch" data directories to speed
up startup.
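For the shared-storage setup you described, the takeover on B is really
just mount-and-start; a sketch, where the device name and mount point
are placeholders and node A must already be fenced (powered off or cut
from the storage), since two nodes writing the same pgdata will corrupt
it:

```shell
#!/bin/sh
# Takeover sketch for node B (device/paths are placeholders).
set -e
mount /dev/shared0 /common           # attach the twin-tailed storage
rm -f /common/data/postmaster.pid    # stale pid file left by crashed A
                                     # (safe only because A is fenced)
pg_ctl -D /common/data -l /var/log/pg_takeover.log start
# Startup now performs WAL replay back to the last checkpoint; how long
# that takes is bounded by checkpoint_segments on the failed node.
```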
--
Chander Ganesan
Open Technology Group, Inc.
One Copley Parkway, Suite 210
Morrisville, NC 27560
919-463-0999/866-229-3386
http://www.otg-nc.com