Re: Dangers of fsync = off

Tom Lane <tgl@xxxxxxxxxxxxx> · Thu, 03 May 2007 22:30:12 -0400

Joel Dice <dicej@xxxxxxxxxxxxx> writes:
> It's clear from the documentation for the fsync configuration option that 
> turning it off may lead to unrecoverable data corruption.  I'd like to 
> learn more about why this is possible and how likely it really is.

As you note, WAL is not particularly vulnerable --- the worst likely
consequence is not being able to read the last few WAL entries that
were made.

The real problem with fsync off is that there is essentially no
guarantee about the relative write order of WAL and data files.
In particular, some data-file updates might hit disk before the
corresponding WAL entries.  If other data-file updates that are part
of the same transaction did *not* reach disk before a crash, then
replay of WAL might not cause those updates to happen (because
the relevant WAL records are unreadable), leaving you with
inconsistent data.
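
To make the ordering problem concrete, here is a minimal toy sketch in
Python. It is not PostgreSQL code: the page names, the two-account
transfer, and the crash_and_recover helper are all hypothetical, and
the model ignores almost everything a real storage engine does. It only
shows how one early data-file write plus lost WAL records leaves a
half-applied transaction that replay cannot repair.

```python
# Toy model only: not PostgreSQL code. All names here are hypothetical.

def crash_and_recover(disk_pages, durable_wal):
    """Replay whatever WAL survived on top of whatever pages survived."""
    pages = dict(disk_pages)
    for page, value in durable_wal:
        pages[page] = value
    return pages

# One transaction moves 10 units from account A (page "a") to B (page "b").
initial = {"a": 100, "b": 0}
wal = [("a", 90), ("b", 10)]

# With ordering enforced: WAL is durable before any data page it describes,
# so both records survive even if no data page was written yet.
safe = crash_and_recover(initial, durable_wal=wal)

# Without ordering: the OS happened to write page "b" early, but the
# WAL records never reached disk before the crash.
disk_after_crash = {"a": 100, "b": 10}
broken = crash_and_recover(disk_after_crash, durable_wal=[])

print(safe)    # {'a': 90, 'b': 10}
print(broken)  # {'a': 100, 'b': 10}
```

In the second case the total has silently grown from 100 to 110, and
nothing left in the surviving WAL describes how to undo or complete it.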

Another scenario is that a checkpoint is shown as completed by WAL but
not all of the before-the-checkpoint data-file updates actually reached
disk.  WAL replay will start from the checkpoint and therefore not fix
the missing updates.
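
The checkpoint case can be sketched the same way. Again this is a
hypothetical toy model, not how PostgreSQL represents WAL or
checkpoints: the "CHECKPOINT" marker, the page names, and the
replay_from_checkpoint helper are assumptions made for illustration.

```python
# Toy model only: not PostgreSQL code. All names here are hypothetical.

def replay_from_checkpoint(disk_pages, wal):
    """Apply only the WAL records that follow the last checkpoint marker."""
    last_ckpt = max(i for i, rec in enumerate(wal) if rec == "CHECKPOINT")
    pages = dict(disk_pages)
    for page, value in wal[last_ckpt + 1:]:
        pages[page] = value
    return pages

wal = [("x", 1), ("y", 2), "CHECKPOINT", ("z", 3)]

# The checkpoint record implies pages "x" and "y" are on disk, but the
# write of "y" was still sitting in the OS cache at crash time.
disk_after_crash = {"x": 1, "y": 0, "z": 0}

recovered = replay_from_checkpoint(disk_after_crash, wal)
print(recovered)  # {'x': 1, 'y': 0, 'z': 3}
```

Page "y" keeps its stale value because replay never looks at the
records before the checkpoint.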

Either way you have inconsistencies in on-disk data, such as missing
tuples, multiple live versions of the same tuple, index contents not
consistent with heap, or outright-corrupt index structure.  The extent
to which these things are visible to applications is hard to predict,
but it's frequently ugly :-(.  Index problems can always be fixed with
REINDEX, but there's no fix for inconsistent heap contents.

			regards, tom lane