Part of the problem seems to be that the GitHub Actions runner has a bug
on macOS: https://github.com/actions/runner/issues/884

After investigating this for a while with a self-hosted Actions runner,
it seems to come down to a broken pipe triggering incomplete output
capture / termination detection, either in the GitHub Actions runner
itself (see the issue thread) or perhaps even in .NET Core's
System.Diagnostics.Process functionality.

As for the actual failing test, t9211, what I got on my machine was a
failure during clone: `unknown repository extension found: refstorage`.
In the trash directory, the .git/config did specify that extension.
Perhaps there is some interference coming from
t9500-gitweb-standalone-no-errors.sh, since it invokes:

> git config extensions.refstorage "$refstorage"

(A rough reproduction sketch of that theory is appended at the end of
this mail.)

Cheers,
Phil

On Fri, May 17, 2024 at 7:30 AM Patrick Steinhardt <ps@xxxxxx> wrote:
>
> On Fri, May 17, 2024 at 10:25:20AM +0200, Patrick Steinhardt wrote:
> > On Fri, May 17, 2024 at 04:11:32AM -0400, Jeff King wrote:
> > > On Thu, May 16, 2024 at 02:36:19PM +0200, Patrick Steinhardt wrote:
> > [snip]
> > > One can guess that scalar is in waitpid() waiting for git-fetch. But
> > > what's fetch waiting on? The other side of upload-pack is dead.
> > > According to lsof, it does have a unix socket open to fsmonitor. So
> > > maybe it's trying to read there?
> >
> > That was also my guess. I tried whether disabling fsmonitor via
> > `core.fsmonitor=false` helps, but that did not seem to be the case.
> > Either because it didn't have the desired effect, or because the root
> > cause is not fsmonitor. No idea which of both it is.
>
> The root cause actually is the fsmonitor. I was using your tmate hack to
> SSH into one of the failed jobs, and there had been 7 instances of the
> fsmonitor lurking. After killing all of them the job got unstuck and ran
> to completion.
>
> The reason why setting `core.fsmonitor=false` is ineffective is because
> in "scalar.c" we always configure `core.fsmonitor=true` in the repo
> config and thus override the setting. I was checking whether it would
> make sense to defer enabling the fsmonitor until after the fetch and
> checkout have concluded. But funny enough, the below patch caused the
> pipeline to now hang deterministically.
>
> Puzzled.
>
> Patrick
>
> diff --git a/scalar.c b/scalar.c
> index 7234049a1b..67f85c7adc 100644
> --- a/scalar.c
> +++ b/scalar.c
> @@ -178,13 +178,6 @@ static int set_recommended_config(int reconfigure)
>                                       config[i].key, config[i].value);
>          }
>
> -        if (have_fsmonitor_support()) {
> -                struct scalar_config fsmonitor = { "core.fsmonitor", "true" };
> -                if (set_scalar_config(&fsmonitor, reconfigure))
> -                        return error(_("could not configure %s=%s"),
> -                                     fsmonitor.key, fsmonitor.value);
> -        }
> -
>          /*
>           * The `log.excludeDecoration` setting is special because it allows
>           * for multiple values.
> @@ -539,6 +532,13 @@ static int cmd_clone(int argc, const char **argv)
>          if (res)
>                  goto cleanup;
>
> +        if (have_fsmonitor_support()) {
> +                struct scalar_config fsmonitor = { "core.fsmonitor", "true" };
> +                if (set_scalar_config(&fsmonitor, 0))
> +                        return error(_("could not configure %s=%s"),
> +                                     fsmonitor.key, fsmonitor.value);
> +        }
> +
>          res = register_dir();
>
>  cleanup:
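P.S. For illustration, here is roughly how a leftover
`extensions.refstorage` entry could end up producing that clone failure.
This is only a sketch; the repository name, the config value, and the
"unaware" git binary are placeholders rather than values taken from the
CI logs:

    # Declare the extension by hand, mimicking what the trash directory's
    # .git/config ended up containing.
    git init repro
    git -C repro config core.repositoryformatversion 1
    git -C repro config extensions.refstorage files

    # A git binary built without knowledge of the refstorage extension
    # should then refuse to open the repository, with the same
    # "unknown repository extension" error as above.
    /path/to/unaware/git -C repro status

And regarding the lurking fsmonitor daemons: in case it helps when poking
at a stuck job, they can be inspected and stopped per worktree (the
enlistment path below is a placeholder):

    # Report whether a daemon is watching the scalar-cloned worktree.
    git -C <enlistment>/src fsmonitor--daemon status

    # Ask the daemon to exit cleanly; killing the processes is the
    # blunter fallback.
    git -C <enlistment>/src fsmonitor--daemon stop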