On Thu, Oct 28 2021, Eric Wong wrote:

> "core.fsync" and the "GIT_FSYNC" environment variable now exist
> for disabling fsync() even on packfiles and other critical data.

First off, I think this would be useful to add, even as a non-test
thing. There's plenty of use-cases where users don't care about
fsync(), and aside from data reliability it also has implications for
say laptop battery use.

I care about my E-Mail, but my E-Mail syncing thingy (mbsync) has
fsync=off in its config. I'm pretty confident that I won't lose power
on that *nix laptop, and even if I did I have other recovery methods.

I wouldn't use this in general, but might for some operations, e.g.
when I'm doing some batch operation with git-annex or whatever.

> Running "make test -j8 NO_SVN_TESTS=1" on a noisy 8-core system
> on an HDD, adding "GIT_FSYNC=0" to the invocation brings my test
> runtime from ~4 minutes down to ~3 minutes.

On a related topic: have you tried having it use an empty template
directory? I have some unsubmitted patches for that, and if you retain
all trash dirs you'll find that we accumulate IIRC 100-200MB of just
*.sample etc. hooks, out of ~1GB total for the tests.

> +core.fsync::
> +	A boolean to control 'fsync()' on all files.
> ++
> +Setting this to false can speed up writes of unimportant data.
> +Disabling fsync will lead to data loss on power failure. If set
> +to false, this also overrides core.fsyncObjectFiles. Defaults to true.
> +

Per the recent threads about fsync semantics this should really be
worded to be more scary, i.e. it's not guaranteed that your data is
on-disk at all, so subsequent git operations might see repo corruption,
if we're being POSIX-pedantic.

So some more "all bets are off" wording would be better here, and any
extra promises could be e.g.:

    On commonly deployed OS's such as Linux, OSX etc. a lack of fsync()
    still gives you extra guarantees that are non-standard, primarily
    that a subsequently running process that accesses the data will see
    the updated version, as the VFS will serve up the new version even
    if it's not on-disk yet.

Even these light guarantees may not be something you're getting from
your OS. Here be dragons!

> [...]
> --- a/git-cvsserver.perl
> +++ b/git-cvsserver.perl
> @@ -3607,6 +3607,25 @@ package GITCVS::updater;
>  use strict;
>  use warnings;
>  use DBI;
> +our $_use_fsync;

s/our/my/?

> +# n.b. consider using Git.pm
> +sub use_fsync {
> +	if (!defined($_use_fsync)) {
> +		my $x = $ENV{GIT_FSYNC};
> +		if (defined $x) {
> +			local $ENV{GIT_CONFIG};
> +			delete $ENV{GIT_CONFIG};
> +			my $v = ::safe_pipe_capture('git', '-c', "core.fsync=$x",
> +				qw(config --type=bool core.fsync));
> +			$_use_fsync = defined($v) ? ($v eq "true\n") : 1;
> +		} else {
> +			my $v = `git config --type=bool core.fsync`;
> +			$_use_fsync = $v eq "false\n" ? 0 : 1;
> +		}
> +	}
> +	$_use_fsync;
> +}

I wonder most of all if we really need these perl changes. Does it
really matter for the overall performance win you're after, or was it
just to get all fsync() uses out of the test suite?

This is a more general comment, but so much of scaffolding like that in
the *.perl scripts could go away if we taught git.c some "not quite a
builtin, but it's ours" mode, and had git.c set the various data that
say git-send-email, git-svn etc. need in the environment before we
invoke them.
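Just to sketch that idea (nothing here exists today; the helper name is
made up, and GIT_FOR_CVSSERVER_FSYNC is only the example name used
below), git.c could resolve the config once and export it right before
exec-ing the script, roughly:

    /*
     * Hypothetical helper in git.c, run just before we exec a "not
     * quite a builtin" script such as git-cvsserver.  It resolves
     * core.fsync once, lets GIT_FSYNC override it (mirroring the
     * use_fsync() helper quoted above), and exports the result for
     * the script to pick up.
     */
    static void export_cvsserver_env(void)
    {
            int fsync_cfg = 1; /* default: do fsync() */

            git_config_get_bool("core.fsync", &fsync_cfg);
            fsync_cfg = git_env_bool("GIT_FSYNC", fsync_cfg);
            setenv("GIT_FOR_CVSSERVER_FSYNC", fsync_cfg ? "1" : "0", 1);
    }

That would also give git-svn, git-send-email etc. one obvious place to
pick up future knobs from, instead of each script growing its own "git
config" plumbing.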
Then this would just be say:

    our $use_fsync = $ENV{GIT_FOR_CVSSERVER_FSYNC};

> +	if ($self->{dbdriver} eq 'SQLite' && !use_fsync()) {
> +		$self->{dbh}->do('PRAGMA synchronous = OFF');
> +	}

...in particular if we're doing this we should update the docs to say
that we'll also tweak third-party software's use of fsync() in some
cases. Even if we "own" that sqlite DB, a user might not expect a git
option to go so far as to be the cause of their on-disk sqlite DB's
corruption.

> [...]
>  	my $sync;
>  	# both of these options make our .rev_db file very, very important
>  	# and we can't afford to lose it because rebuild() won't work
> -	if ($self->use_svm_props || $self->no_metadata) {
> +	if (($self->use_svm_props || $self->no_metadata) && use_fsync()) {
>  		require File::Copy;
>  		$sync = 1;
>  		File::Copy::copy($db, $db_lock) or die "rev_map_set(@_): ",

This doesn't just impact $io->sync, but also $io->flush, which is a
different thing: PerlIO flushing to the OS != fsync(). So even on OS's
that make the above noted guarantees you'd get nothing, since we're now
at the mercy of perl not crashing in the interim, not kernel or stdlib
behavior.

> @@ -57,7 +58,9 @@ void fprintf_or_die(FILE *f, const char *fmt, ...)
>  
>  void fsync_or_die(int fd, const char *msg)
>  {
> -	while (fsync(fd) < 0) {
> +	if (use_fsync < 0)
> +		use_fsync = git_env_bool("GIT_FSYNC", 1);
> +	while (use_fsync && fsync(fd) < 0) {
>  		if (errno != EINTR)
>  			die_errno("fsync error on '%s'", msg);
>  	}

This patch direction looks good to me. Aside from the above, we should
really update any other fsync() docs we have, maybe with a
cross-reference in core.fsyncObjectFiles?

I'm not sure offhand what the state of the other fsync patches that
Neeraj Singh has been submitting is, but let's make sure that whatever
docs/config knobs etc. we have all play nicely together, and are
somewhat future-proof if possible.