Re: [PATCH v2 5/7] builtin/gc: add a `--detach` flag

Patrick Steinhardt <ps@xxxxxx> · Fri, 16 Aug 2024 10:06:48 +0200

On Thu, Aug 15, 2024 at 03:29:20PM -0700, Junio C Hamano wrote:
> Junio C Hamano <gitster@xxxxxxxxx> writes:
> 
> > Patrick Steinhardt <ps@xxxxxx> writes:
> >
> >> +test_expect_success '--detach overrides gc.autoDetach=false' '
> >> +	test_when_finished "rm -rf repo" &&
> >> +	git init repo &&
> >> +	(
> >> +		cd repo &&
> >> +
> >> +		# Prepare the repository such that git-gc(1) ends up repacking.
> >> +		test_commit "$(test_oid blob17_1)" &&
> >> +		test_commit "$(test_oid blob17_2)" &&
> >> +		git config gc.autodetach false &&
> >> +		git config gc.auto 2 &&
> >> +
> >> +		cat >expect <<-EOF &&
> >> +		Auto packing the repository in background for optimum performance.
> >> +		See "git help gc" for manual housekeeping.
> >> +		EOF
> >> +		GIT_PROGRESS_DELAY=0 git gc --auto --detach 2>actual &&
> >> +		test_cmp expect actual
> >> +	)
> >> +'
> >
> > If the gc/maintenance is going to background itself, it is possible
> > that it still is running, possibly with files under repo/.git/ open
> > and the process running in repo directory, when the test_when_finished
> > clean-up trap goes in effect?
> >
> > I am wondering where this comes from:
> >
> >   https://github.com/git/git/actions/runs/10408467351/job/28825980833#step:6:2000
> >
> > where "rm -rf repo" dies with an unusual
> >
> >   rm: can't remove 'repo/.git': Directory not empty
> >
> > and my theory is that after "rm -rf" _thinks_ it removed everything
> > underneath, before it attempts to rmdir("repo/.git"), the repack
> > process in the background has created a new pack, and "rm -rf" does
> > not go back and try to create such a new cruft.
> >
> > The most robust way to work around such a "race" is to wait for the
> > backgrounded process before cleaning up, or after seeing that the
> > message we use as a signal that the "gc" has backgrounded itself,
> > kill that backgrounded process before exiting the test and causing
> > the clean-up to trigger.
> 
> There already is a clue left by those who worked on this test the
> last time at the end of the script.  It says:
> 
>     # DO NOT leave a detached auto gc process running near the end of the
>     # test script: it can run long enough in the background to racily
>     # interfere with the cleanup in 'test_done'.
> 
> immediately before "test_done".
> 
> In the meantime, I am wondering something simple and silly like the
> attached is sufficient.  The idea is that we expect the "oops we
> couldn't clean" code not to trigger most of the time, but if it
> does, we just wait (with back off) a bit and retry.

Ah, indeed, that is a problem. We already have a better tool to fix this
with `run_and_wait_for_auto_gc()`. It creates a separate file descriptor
and waits for it to close via some shell trickery, which will only
happen once the child process of git-gc(1) has exited.

Will fix, thanks!

Patrick