Re: broken repo after power cut

"Theodore Ts'o" <tytso@xxxxxxx> · Sun, 21 Jun 2015 20:35:51 -0400

On Sun, Jun 21, 2015 at 03:07:41PM +0200, Richard Weinberger wrote:

> > I was then shocked to learn that ext4 apparently has a default
> > setting that allows it to truncate files upon power failure
> > (something about a full journal vs a fast journal or some such)

s/ext4/all modern file systems/

POSIX makes **no guarantees** about what happens after a power failure
unless you use fsync() --- which git does not do by default (see below).

> You mean the ext4 delayed block allocation feature/issue?
> IIRC Ted added some hacks to ext4 to detect misbehaving applications (Gnome and KDE).
> But to my knowledge such an file corruption must not happen if the application behaves well. And it can happen on all file systems.
> Ted, maybe you can help us? BTW: I'm using ext4's default mount options from openSUSE, data=ordered.

The hacks (which were agreed upon by all of the major file system
developers --- ext4, btrfs, xfs --- at the Linux File Systems and
Storage summit a couple of years ago) protect against the default
text editors of GNOME and KDE, which were saving files without using
fsync(), and against one particularly egregious example (although I
don't remember which program was doing this) which updated files by
opening the file with O_TRUNC and then rewriting the new contents of
the file.  So if you crashed just after the open(2), and before the
file data was written, you were guaranteed to lose data.

The hack protects against data loss when programs update a file
incompetently.  What we agreed to do was that upon renaming a fileA on
top of another fileB, there is an implicit writeback initiated of
fileA.  If the program properly called fsync(2) before closing the
file descriptor for fileA and doing the rename, this implicit
writeback would be a no-op.  Similarly, if a file descriptor was opened
with O_TRUNC, when the file descriptor is closed, we start an implicit
writeback at that point.  Note that this is not the same as a full
fsync; it merely narrows the race window from 30 seconds down to a
second or so (depending on how busy the disk is).
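
For illustration, here is a minimal sketch of the well-behaved update
pattern described above: write the new contents to a temporary file,
fsync it, and only then rename it on top of the old file.  It is
written in Python for brevity; the function name atomic_write is
hypothetical, and the sketch assumes the data is a bytes object and
that the temporary file can live in the same directory as the target.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the contents of `path` with `data` (bytes) via a temp file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so that the rename stays
    # within one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()              # push Python's buffer to the kernel
            os.fsync(tmp.fileno())   # ask the kernel to push it to disk
        os.replace(tmp_path, path)   # rename(2) over the old file
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Because the fsync happens before the rename, a crash should leave
either the complete old file or the complete new file, never a
truncated one.  Note that mkstemp creates the file with mode 0600, so
a real tool would probably also want to copy the original file's
permissions.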

But this hack does not protect against freshly written files, which is
the case for git object files or git pack files.  The basic idea here
is that you could have just as easily crashed before the commit as
after the commit, and doing an implicit writeback after all file
closes would have destroyed performance and penalized programs that
didn't really care so much about the file hitting disk.  (For example,
if you do a compile, and you crash, it's not such a big deal.)
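
As a rough sketch of what an application has to do for a freshly
written file that it does care about, the following Python fragment
creates a new file, writes it, and calls fsync before closing.  The
function name write_new_file_durably is hypothetical, and this is not
how git itself is implemented; it only illustrates the explicit fsync
that the implicit writeback does not provide for new files.

```python
import os


def write_new_file_durably(path, data):
    """Create `path` (which must not exist yet), write `data`, and fsync it."""
    # O_EXCL makes the call fail if the file already exists.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)  # without this, the data may sit in cache for a while
    finally:
        os.close(fd)
```

A program that does not care whether the file survives a crash (a
compiler writing an object file, say) would simply skip the fsync call
and keep the speed.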

The bottom line is that if you care about files being written, you
need to use fsync().  Should git use fsync() by default?  Well, if you
are willing to accept that if your system crashes within a second or
so of your last git operation, you might need to run "git fsck" and
potentially recover from a busted repo, maybe speed is more important
for you (and git is known for its speed/performance, after all. :-)

The actual state of the source tree would have been written using a
text editor which tends to be paranoid about using fsync (at least, if
you use a real editor like Emacs or Vi, as opposed to the toy notepad
editors shipped with GNOME or KDE :-).  So as long as you know what
you're doing, it's unlikely that you will actually lose any work.

Personally, I have core.fsyncobjectfiles set to yes in my .gitconfig.
Part of this is because I have an SSD, so the speed hit really doesn't
bother me, and needing to recover a corrupted git repository is a pain
(although I have certainly done it in the past).

						- Ted