In response to Joel Dice <dicej@xxxxxxxxxxxxx>:

> Thanks for your response, Andrew.
>
> On Tue, 8 May 2007, Andrew Sullivan wrote:
>
> > On Fri, May 04, 2007 at 08:54:10AM -0600, Joel Dice wrote:
> >>
> >> My next question is this: what are the dangers of turning fsync off in the
> >> context of a high-availability cluster using asynchronous replication?
> >
> > My real question is why you want to turn it off.  If you're using a
> > battery-backed cache on your disk controller, then fsync ought to be
> > pretty close to free.  Are you sure that turning it off will deliver
> > the benefit you think it will?
>
> You may very well be right.  I tend to think in terms of software
> solutions, but a hardware solution may be most appropriate here.  In any
> case, I'm not at all sure this will bring a significant performance
> improvement.  I just want to understand the implications before I start
> fiddling; if fsync=off is dangerous, it doesn't matter what the
> performance benefits may be.
>
> >> on Y.  Thus, database corruption on X is irrelevant since our first step
> >> is to drop them.
> >
> > Not if the corruption introduces problems for replication, which is
> > indeed possible.
>
> That's exactly what I want to understand.  How, exactly, is this possible?
> If the danger of fsync is that it may leave the on-disk state of the
> database in an inconsistent state after a crash, it would not seem to have
> any implications for activity occurring prior to the crash.  In
> particular, a trigger-based replication system would seem to be immune.

If you mean Slony, no.  It's not immune.  Slony maintains its state in
tables in the database.  If fsync is off, there's no guarantee that Slony's
state information is sane, which means replication is not guaranteed to be
or do anything.

> In other words, while there may be ways the master could cause corruption
> on the slave, I don't see how they could be related to the fsync setting.

Specifically, I can imagine a system crashing, then _seeming_ to restart
properly, but with Slony re-replicating transactions that have already
been replicated once, because the ACKs were never written to disk on the
master.  Take the example of a query "UPDATE tablename SET x = x + 1".
If this query is erroneously applied twice, the data is corrupted.
Other scenarios may be possible.

-- 
Bill Moran
http://www.potentialtech.com
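
For illustration only, here is a minimal sketch of the double-apply scenario described above.  It is a toy model and not Slony's actual schema or protocol: the event_log table, its confirmed column, and the apply_pending() and simulate() helpers are all hypothetical names, and two in-memory SQLite databases stand in for the master and the slave.  The "crash with fsync off" is modeled, as an assumption, by rolling back the master's uncommitted confirmation so that it never reaches disk.

```python
import sqlite3


def apply_pending(master, replica):
    """Apply every event the master believes is unconfirmed, then mark it confirmed."""
    rows = master.execute(
        "SELECT id, stmt FROM event_log WHERE confirmed = 0 ORDER BY id"
    ).fetchall()
    for event_id, stmt in rows:
        replica.execute(stmt)
        master.execute(
            "UPDATE event_log SET confirmed = 1 WHERE id = ?", (event_id,)
        )
    return len(rows)


def simulate():
    master = sqlite3.connect(":memory:")   # hypothetical stand-in for the master
    replica = sqlite3.connect(":memory:")  # hypothetical stand-in for the slave

    master.execute(
        "CREATE TABLE event_log (id INTEGER PRIMARY KEY, stmt TEXT, confirmed INTEGER)"
    )
    replica.execute("CREATE TABLE tablename (x INTEGER)")
    replica.execute("INSERT INTO tablename VALUES (0)")

    # The replicated statement itself is durably recorded on the master.
    master.execute(
        "INSERT INTO event_log (stmt, confirmed) VALUES (?, 0)",
        ("UPDATE tablename SET x = x + 1",),
    )
    master.commit()

    # First pass: the replica applies the update and keeps it.
    apply_pending(master, replica)
    replica.commit()

    # Assumed crash: the master's confirmation was never written to disk.
    master.rollback()

    # After the restart the master still sees the event as unconfirmed.
    replayed = apply_pending(master, replica)
    master.commit()
    replica.commit()

    x = replica.execute("SELECT x FROM tablename").fetchone()[0]
    return replayed, x


if __name__ == "__main__":
    replayed, x = simulate()
    print(f"events replayed after crash: {replayed}, x on replica: {x}")
```

The statement was logged once, yet the replica ends up with x incremented twice, because the update is not idempotent and the record of its confirmation was lost.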