Re: BUG: Race condition due to reflog expiry in "gc"

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Wed, 13 Mar 2019 11:28:39 +0100

On Wed, Mar 13 2019, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes:
>
>> I'm still working on fixing a race condition I encountered in "gc"
>> recently, but am not 100% sure of the fix. So I thought I'd send a
>> braindump of what I have so far in case it jolts any memories.
>>
>> The problem is that when we "gc" we run "reflog expire --all". This
>> iterates over the reflogs, and takes a *.lock for each reference.
>>
>> It'll fail intermittendly in two ways:
>>
>>  1. If something is concurrently committing to the repo it'll fail
>>     because we for a tiny amount of time hold a HEAD.lock file, so HEAD
>>     can't be updated.
>>
>>  2. On a repository that's just being "git fetch"'d by some concurrent
>>     process the "gc" will fail, because the ref's SHA1 has changed,
>>     which we inspect as we aquire the lock.
>
> Both sounds very much expected and expectable outcome.  I am not
> sure how they need to be called bugs.

Let's leave aside that I started the subject with "BUG:" and let me
rephrase.

I was under the impression that git-gc was supposed to support operating
on a repository that's concurrently being modified, as long as you don't
set the likes of gc.pruneExpire too aggressively.

Running a "gc" in a loop without "git reflog expire --all" and when
watching the repository being GC'd with:

    fswatch -l 0.1 -t -r . 2>&1 | grep lock

We only create .git/MERGE_RR.lock, .git/gc.pid.lock and
git/packed-refs.lock. These are all things that would only cause another
concurrent GC to fail, not a normal git command.

So the only reason a concurrent commit (case #1) fails is because of the
refs being locked during the reflog iteration, and similarly "gc" itself
will fail due to a concurrently updating ref (case #2).

It seems that first of all we need this, I'll submit that as a separate
patch sometime soon:

    diff --git a/builtin/gc.c b/builtin/gc.c
    index 020f725acc..ae488646e1 100644
    --- a/builtin/gc.c
    +++ b/builtin/gc.c
    @@ -127,6 +127,12 @@ static void gc_config(void)
     			pack_refs = git_config_bool("gc.packrefs", value);
     	}

    +	if (!git_config_get_value("gc.reflogexpire", &value) && value &&
    +	    !strcmp(value, "never") &&
    +	    !git_config_get_value("gc.reflogexpireunreachable", &value) && value &&
    +	    !strcmp(value, "never"))
    +		prune_reflogs = 0;
    +
     	git_config_get_int("gc.aggressivewindow", &aggressive_window);
     	git_config_get_int("gc.aggressivedepth", &aggressive_depth);
     	git_config_get_int("gc.auto", &gc_auto_threshold);

I.e. now even if your gc.* config says you don't want the reflogs
touched, we still pointlessly iterate over all of them. The case I'm
running into (a variant of #2) is one solved by that patch, i.e. I'm
fine "gc" just having the reflogs kept forever as a workaround in this
case.

Something like that should have been added back in 62aad1849f ("gc
--auto: do not lock refs in the background", 2014-05-25), i.e. now the
"prune_reflogs" variable is never used, it's just cargo-culted from a
copy/pasting of the "pack_refs" code.

In other "gc" phases in "pack-objects" and "prune" we also look at the
reflogs. This obviously bad patch ignores them entirely:

    diff --git a/builtin/prune.c b/builtin/prune.c
    index 97613eccb5..bccee7813e 100644
    --- a/builtin/prune.c
    +++ b/builtin/prune.c
    @@ -41,7 +41,7 @@ static void perform_reachability_traversal(struct rev_info *revs)

     	if (show_progress)
     		progress = start_delayed_progress(_("Checking connectivity"), 0);
    -	mark_reachable_objects(revs, 1, expire, progress);
    +	mark_reachable_objects(revs, 0, expire, progress);
     	stop_progress(&progress);
     	initialized = 1;
     }
    diff --git a/builtin/repack.c b/builtin/repack.c
    index 67f8978043..618ffbfe0a 100644
    --- a/builtin/repack.c
    +++ b/builtin/repack.c
    @@ -364,7 +364,6 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
     				 keep_pack_list.items[i].string);
     	argv_array_push(&cmd.args, "--non-empty");
     	argv_array_push(&cmd.args, "--all");
    -	argv_array_push(&cmd.args, "--reflog");
     	argv_array_push(&cmd.args, "--indexed-objects");
     	if (repository_format_partial_clone)
     		argv_array_push(&cmd.args, "--exclude-promisor-objects");

I'm just including that as illustration that add_reflogs_to_pending() in
revision.c during "gc" already iterates over the reflogs without locking
anything, but of course it's just reading them.

So one thing that would mitigate things a lot is if
files_reflog_expire() and its call to expire_reflog_ent() via
refs_for_each_reflog_ent() would lazily aquire the lock on the ref.

Digging a bit further that's actually what we're doing now since
4ff0f01cb7 ("refs: retry acquiring reference locks for 100ms",
2017-08-21).

But this runs into the logic we've had for a long time, or since your
bda3a31cc7 ("reflog-expire: Avoid creating new files in a directory
inside readdir(3) loop", 2008-01-25) where we first loop over all the
refs in the process of finding the reflogs, and then will try to lock
those refs at those expected SHA-1s. If they've changed in the meantime
we error out don't clean up the lockfile.

So just this fixes that:

    diff --git a/refs/files-backend.c b/refs/files-backend.c
    index ef053f716c..b6576f28b9 100644
    --- a/refs/files-backend.c
    +++ b/refs/files-backend.c
    @@ -3037,7 +3037,7 @@ static int files_reflog_expire(struct ref_store *ref_store,
     	 * reference itself, plus we might need to update the
     	 * reference if --updateref was specified:
     	 */
    -	lock = lock_ref_oid_basic(refs, refname, oid,
    +	lock = lock_ref_oid_basic(refs, refname, NULL,
     				  NULL, NULL, REF_NO_DEREF,
     				  &type, &err);
     	if (!lock) {

Which seems sensible to me. We'll still get the lock, we just don't
assert that the refname we use to get the lock must be at that
SHA-1. We'll still use it for the purposes of expiry.

But maybe I've missed some caveat in reflog_expiry_prepare() and friends
and we really do need the reflog at that OID, then this would suck less:

    diff --git a/builtin/reflog.c b/builtin/reflog.c
    index 4d3430900d..4bb272fdc8 100644
    --- a/builtin/reflog.c
    +++ b/builtin/reflog.c
    @@ -625,12 +625,16 @@ static int cmd_reflog_expire(int argc, const char **argv, const char *prefix)
     		free_worktrees(worktrees);
     		for (i = 0; i < collected.nr; i++) {
     			struct collected_reflog *e = collected.e[i];
    +			int st;
     			set_reflog_expiry_param(&cb.cmd, explicit_expiry, e->reflog);
    -			status |= reflog_expire(e->reflog, &e->oid, flags,
    -						reflog_expiry_prepare,
    -						should_expire_reflog_ent,
    -						reflog_expiry_cleanup,
    -						&cb);
    +			st = reflog_expire(e->reflog, &e->oid, flags,
    +					   reflog_expiry_prepare,
    +					   should_expire_reflog_ent,
    +					   reflog_expiry_cleanup,
    +					   &cb);
    +			if (st == -2)
    +				continue;
    +			status |= st;
     			free(e);
     		}
     		free(collected.e);
    diff --git a/refs/files-backend.c b/refs/files-backend.c
    index ef053f716c..8b0b6b7b85 100644
    --- a/refs/files-backend.c
    +++ b/refs/files-backend.c
    @@ -3041,6 +3041,11 @@ static int files_reflog_expire(struct ref_store *ref_store,
     				  NULL, NULL, REF_NO_DEREF,
     				  &type, &err);
     	if (!lock) {
    +		if (errno == EBUSY) {
    +			warning("cannot lock ref '%s': %s. Skipping!", refname, err.buf);
    +			strbuf_release(&err);
    +			return -2;
    +		}
     		error("cannot lock ref '%s': %s", refname, err.buf);
     		strbuf_release(&err);
     		return -1;

I.e. we just detect the EBUSY that verify_lock() sets if the OID doesn't
match, and don't prune that reflog. As seen above "pack-objects" and
"prune" will still iterate over the same logs later for the purposes of
reachability, so this shouldn't get us into a corrupt state due to
throwing away objects referenced in those logs, we'll just prune fewer
things than we could have.

So I think I'll use the first patch noted above as a hack to solve the
narrow problem I have now, but any comments on the above most
welcome. I'm not very familiar with the ref code in case that wasn't
obvious already.

B.t.w. the mention of f3b661f766 ("expire_reflog(): use a lock_file for
rewriting the reflog file", 2014-12-12) upthread is irrelevant. That's a
commit where we use the lockfile code to write out the *new* reflog,
which is unrelated to all of this.