On Sat, Aug 17, 2024 at 03:09:24AM -0400, Jeff King wrote: > On Fri, Aug 16, 2024 at 12:45:15PM +0200, Patrick Steinhardt wrote: > > > +test_expect_success '--no-detach causes maintenance to not run in background' ' > > [...] > > + # We have no better way to check whether or not the task ran in > > + # the background than to verify whether it output anything. The > > + # next testcase checks the reverse, making this somewhat safer. > > + git maintenance run --no-detach >out 2>&1 && > > + test_line_count = 1 out > > [...] > > +test_expect_success '--detach causes maintenance to run in background' ' > > + test_when_finished "rm -rf repo" && > > + git init repo && > > + ( > > + cd repo && > > + > > + test_commit something && > > + git config set maintenance.gc.enabled false && > > + git config set maintenance.loose-objects.enabled true && > > + git config set maintenance.loose-objects.auto 1 && > > + git config set maintenance.incremental-repack.enabled true && > > + > > + git maintenance run --detach >out 2>&1 && > > + test_must_be_empty out > > + ) > > +' > > This second test seems to fail racily (or maybe always? see below). In > CI on Windows, I saw: > > 'out' is not empty, it contains: > fc9fea69579f349e3b02e3264cffbef03e4b1852 > > That would make sense to me if the detached process still held the > original stdout/stderr channel open (in which case we'd racily see the > same line as in the no-detach case). But we do appear to call > daemonize(), which closes both. > > Curiously, the code in gc.c does this: > > /* Failure to daemonize is ok, we'll continue in foreground. */ > if (opts->detach > 0) > daemonize(); > > and the only way for daemonize to fail is if NO_POSIX_GOODIES is set. > Which I'd expect on Windows. But then I'd expect this test to _always_ > fail on Windows. Does it? If so, should it be marked with !MINGW? > > While investigating that, I ran it with --stress locally (on Linux) and > got some odd (and definitely racy) results. The test itself passes, but > the "rm -rf repo" in the test_when_finished sometimes fails with: > > rm: cannot remove 'repo/.git/objects': Directory not empty > > or similar (sometimes it's another directory like 'repo/.git'). My guess > is that the background process is still running and creating files in > the repository, racing with rm's call to rmdir(). > > Even if we remove the test_when_finished, it would mean that the final > cleanup after test_done might similarly fail, leaving a crufty trash > directory. I think to make this robust, we'd need some way of detecting > when the background process has finished. I don't think we report the > pid anywhere, and the daemonize() call means it won't even be in the > same process group. Maybe we could spin looking for the incremental pack > it will create (and timeout after N seconds)? That feels pretty hacky, > but I can't think of anything better. Oh, good catch indeed, I can reproduce this on Linux with `--stress`. I think we can use the same workaround as we do in t6500, namely open a new file descriptor, inherit it and then wait for the child process to close it. Thanks! Patrick