Re: [PATCH] enable core.fsyncObjectFiles by default

Jeff King <peff@xxxxxxxx> · Wed, 17 Jan 2018 15:55:09 -0500

On Wed, Jan 17, 2018 at 07:48:28PM +0100, Christoph Hellwig wrote:

> fsync is required for data integrity as there is no gurantee that
> data makes it to disk at any specified time without it.  Even for
> ext3 with data=ordered mode the file system will only commit all
> data at some point in time that is not guaranteed.
> 
> I've lost data on development machines with various times countless
> times due to the lack of this option, and now lost trees on a
> git server with ext4 as well yesterday.  It's time to make git
> safe by default.

I'm definitely sympathetic, and I've contemplated a patch like this a
few times. But I'm not sure we're "safe by default" here after this
patch. In particular:

  1. This covers only loose objects. We generally sync pack writes
     already, so we're covered there. But we do not sync ref updates at
     all, which we'd probably want to in a default-safe setup (a common
     post-crash symptom I've seen is zero-length ref files).

  2. Is it sufficient to fsync() the individual file's descriptors?
     We often do other filesystem operations (like hardlinking or
     renaming) that also need to be committed to disk before an
     operation can be considered saved.

  3. Related to (2), we often care about the order of metadata commits.
     E.g., a common sequence is:

       a. Write object contents to tempfile.

       b. rename() or hardlink tempfile to final name.

       c. Write object name into ref.lock file.

       d. rename() ref.lock to ref

     If we see (d) but not (b), then the result is a corrupted
     repository. Is this guaranteed by ext4 journaling with
     data=ordered?

It may be that data=ordered gets us what we need for (2) and (3). But I
think at the very least we should consider fsyncing ref updates based on
a config option, too.

-Peff