Re: Dangers of fsync = off

Joel Dice <dicej@xxxxxxxxxxxxx> · Fri, 4 May 2007 08:54:10 -0600 (MDT)

Thanks for the explanation, Tom.  I understand the problem now.

My next question is this: what are the dangers of turning fsync off in the 
context of a high-availability cluster using asynchronous replication?

In particular, we are using Slony-I and linux-ha to provide a two-node, 
master-slave cluster.  As you may know, Slony-I uses triggers to provide 
asynchronous replication.  If the master (X) fails, the slave (Y) becomes 
active.  At this point, the administrator manually performs a recovery by 
reintroducing X so that Y is the master and X is the slave.  This task 
involves dropping any databases on X and having it sync with the versions 
on Y.  Thus, database corruption on X is irrelevant since our first step 
is to drop them.

It would seem that our only exposure is the case where both machines fail 
before the administrator is able to perform the recovery.  Even that could 
be solved by leaving fsync turned on for the slave, so that when failover 
occurs and the slave becomes active, we only turn fsync off once we've 
safely reintroduced the other machine (which, in turn, will have fsync 
turned on).
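
To make the proposed policy concrete, here is a rough sketch in Python.  
The Node type and the fsync_policy function are made up for illustration 
only; they are not part of Slony-I, linux-ha, or PostgreSQL, and the 
"healthy" flag is an assumed stand-in for "this node has been resynced 
and is replicating":

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    role: str      # "master" or "slave"
    healthy: bool  # False once the node has failed and not yet been resynced


def fsync_policy(node: Node, peer: Node) -> bool:
    """Return True if `node` should run with fsync = on (hypothetical rule)."""
    if node.role == "slave":
        # The slave always keeps fsync on, so it is safe to promote.
        return True
    # The master may run with fsync off only while a healthy slave exists;
    # after a failover the new master keeps fsync on until its peer is back.
    return not peer.healthy


# Normal operation: X is master (fsync off), Y is slave (fsync on).
x = Node("X", "master", True)
y = Node("Y", "slave", True)
normal = (fsync_policy(x, y), fsync_policy(y, x))

# X fails and Y is promoted: Y keeps fsync on until X is reintroduced.
y_promoted = Node("Y", "master", True)
x_failed = Node("X", "slave", False)
after_failover = fsync_policy(y_promoted, x_failed)

# X has been dropped, resynced, and reintroduced as the slave.
x_back = Node("X", "slave", True)
after_recovery = (fsync_policy(y_promoted, x_back), fsync_policy(x_back, y_promoted))
```

Under this rule, at most one node runs with fsync off at any time, and 
only while the other node holds a replica with fsync on.  Since 
replication is asynchronous, the replica may still lag by a few 
transactions.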

There was a discussion about this here:

  http://gborg.postgresql.org/pipermail/slony1-general/2005-March/001760.html

However, that discussion seems to assume that the administrator needs to 
salvage the databases on the failed machine, which is not necessary in 
our case.

In short, is there any danger (besides losing a few transactions) of 
turning fsync off on the master of a cluster using asynchronous 
replication, assuming we don't need to recover the data from the master 
when it fails?

Thanks.

 - Joel