On 2024.09.27 00:07, Jeff King wrote:
> On Tue, Sep 24, 2024 at 07:33:08AM +0200, Patrick Steinhardt wrote:
>
> > +test_expect_success 'ref transaction: many concurrent writers' '
> > +	test_when_finished "rm -rf repo" &&
> > +	git init repo &&
> > +	(
> > +		cd repo &&
> > +		# Set a high timeout such that a busy CI machine will not abort
> > +		# early. 10 seconds should hopefully be ample of time to make
> > +		# this non-flaky.
> > +		git config set reftable.lockTimeout 10000 &&
>
> I saw this test racily fail in the Windows CI build. The failure is as
> you might imagine, a few of the background update-ref invocations
> failed:
>
>   fatal: update_ref failed for ref 'refs/heads/branch-21': reftable: transaction failure: I/O error
>
> but of course we don't notice because they're backgrounded. And then the
> expected output is missing the branch-21 entry (and in my case,
> branch-64 suffered a similar fate).
>
> At first I thought we probably needed to bump the timeout (and EIO was
> just our way of passing that up the stack). But looking at the
> timestamps in the Actions log, the whole loop took less than 10ms to
> run.
>
> So could this be indicative of a real contention issue specific to
> Windows? I'm wondering if something like the old "you can't delete a
> file somebody else has open" restriction is biting us somehow.
>
> -Peff

We're seeing repeated failures from this test case with ASan enabled.
Unfortunately, we've only been able to reproduce this on our
$DAYJOB-specific build system. I haven't been able to get it to fail
using just the upstream Makefile so far. I'll keep trying to find a way
to reproduce this.

FWIW, we're not getting I/O errors, we see the following:

  fatal: update_ref failed for ref 'refs/heads/branch-20': cannot lock references

We tried increasing the timeout in the test to 2 minutes (up from 10s),
but it didn't fix the failures.
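
On the point that failed background update-ref invocations go unnoticed
because they are backgrounded: below is a minimal sketch of one way a
shell test could surface such failures, by recording each background
PID and checking its exit status with `wait`. The helper name
`run_writers` and its arguments are hypothetical and are not part of
the test suite or of the patch under discussion.

```shell
# Hypothetical helper: start <count> background writers and fail if any
# one of them exits non-zero. <writer> is the name of a command or shell
# function that receives the writer index as its only argument.
run_writers () {
	count=$1
	writer=$2
	pids=
	i=1
	while test "$i" -le "$count"
	do
		"$writer" "$i" &
		# Remember the PID so the exit status can be collected later.
		pids="$pids $!"
		i=$((i + 1))
	done

	ret=0
	for pid in $pids
	do
		# "wait <pid>" returns the exit status of that specific job.
		wait "$pid" || ret=1
	done
	return "$ret"
}
```

With something along these lines, a writer wrapping the update-ref call
could be passed as `<writer>`, and a single failing invocation would
make the helper (and thus the test) fail instead of only showing up as
a missing entry in the expected output.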