Thanks for the explanation, Tom. I understand the problem now.
My next question is this: what are the dangers of turning fsync off in the
context of a high-availability cluster using asynchronous replication?
In particular, we are using Slony-I and linux-ha to provide a two-node,
master-slave cluster. As you may know, Slony-I uses triggers to provide
asynchronous replication. If the master (X) fails, the slave (Y) becomes
active. At this point, the administrator manually performs a recovery by
reintroducing X so that Y is the master and X is the slave. This task
involves dropping any databases on X and having it sync with the versions
on Y. Thus, database corruption on X is irrelevant since our first step
is to drop them.
It would seem that our only exposure is both machines failing before the
administrator is able to perform the recovery. Even that could be solved
by leaving fsync turned on for the slave: when failover occurs and the
slave becomes active, we turn fsync off on it only once we've safely
reintroduced the other machine (which, in turn, will have fsync turned on).
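To make the proposal concrete, the per-node setting I have in mind is
roughly the following in postgresql.conf. This is only a sketch: X and Y
are the nodes described above, and every setting other than fsync is
assumed to stay as it already is.

```
# postgresql.conf on the current master (X):
# trades crash safety for speed; its databases get dropped after a failure
fsync = off

# postgresql.conf on the current slave (Y):
# kept crash-safe so it can take over as master
fsync = on
```

After a failover the roles swap: Y keeps fsync on until X has been
reintroduced as the slave with fsync on, and only then is fsync turned
off on Y.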
There was a discussion about this here:
http://gborg.postgresql.org/pipermail/slony1-general/2005-March/001760.html
However, that discussion seems to assume that the administrator needs to
salvage the databases on the failed machine, which is not necessary in
our case.
In short, is there any danger (besides losing a few transactions) of
turning fsync off on the master of a cluster using asynchronous
replication, assuming we don't need to recover the data from the master
when it fails?
Thanks.
- Joel