One of the tests in t5616 asserts that git-fetch(1) with `--refetch` triggers repository maintenance with the correct set of arguments. This test is flaky and causes us to fail sometimes: ++ git -c protocol.version=0 -c gc.autoPackLimit=0 -c maintenance.incremental-repack.auto=1234 -C pc1 fetch --refetch origin error: unable to open .git/objects/pack/pack-029d08823bd8a8eab510ad6ac75c823cfd3ed31e.pack: No such file or directory fatal: unable to rename temporary file to '.git/objects/pack/pack-029d08823bd8a8eab510ad6ac75c823cfd3ed31e.pack' fatal: could not finish pack-objects to repack local links fatal: index-pack failed error: last command exited with $?=128 The error message is quite confusing as it talks about trying to rename a temporary packfile. A first hunch would thus be that this packfile gets written by git-fetch(1), but removed by git-maintenance(1) while it hasn't yet been finalized, which shouldn't ever happen. And indeed, when looking closer one notices that the file that is supposedly of temporary nature does not have the typical `tmp_pack_` prefix. As it turns out, the "unable to rename temporary file" fatal error is a red herring and the real error is "unable to open". That error is raised by `check_collision()`, which is called by `finalize_object_file()` when moving the new packfile into place. Because t5616 re-fetches objects, we end up with the exact same pack as we already have in the repository. So when the concurrent git-maintenance(1) process rewrites the preexisting pack and unlinks it exactly at the point in time where git-fetch(1) wants to check the old and new packfiles for equality we will see ENOENT and thus `check_collision()` returns an error, which gets bubbled up by `finalize_object_file()` and is then handled by `rename_tmp_packfile()`. That function does not know about the exact root cause of the error and instead just claims that the rename has failed. This race is thus caused by b1b8dfde69 (finalize_object_file(): implement collision check, 2024-09-26), where we have newly introduced the collision check. By definition, two files cannot collide with each other when one of them has been removed. We can thus trivially fix the issue by ignoring ENOENT when opening either of the files we're about to check for collision. Signed-off-by: Patrick Steinhardt <ps@xxxxxx> --- Hi, this patch is the follow-up for [1], where I've mentioned a couple of CI flakes that happen rather regularly. As it turns out, this race was a real bug hiding in the newly nitroduced object collision check in case one of the files got unlinked while performing the check. Thanks! Patrick [1]: <Z2-2dbYVuuLxpNmK@xxxxxx> --- object-file.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/object-file.c b/object-file.c index 5b792b3dd42cecde43a1b18abc164fd368cbcd69..f84dcd2f2a7b88716ab47bc00ee7a605a82e8d21 100644 --- a/object-file.c +++ b/object-file.c @@ -1978,13 +1978,15 @@ static int check_collision(const char *filename_a, const char *filename_b) fd_a = open(filename_a, O_RDONLY); if (fd_a < 0) { - ret = error_errno(_("unable to open %s"), filename_a); + if (errno != ENOENT) + ret = error_errno(_("unable to open %s"), filename_a); goto out; } fd_b = open(filename_b, O_RDONLY); if (fd_b < 0) { - ret = error_errno(_("unable to open %s"), filename_b); + if (errno != ENOENT) + ret = error_errno(_("unable to open %s"), filename_b); goto out; } --- base-commit: 306ab352f4e98f6809ce52fc4e5d63fb947d0635 change-id: 20241230-b4-pks-object-file-racy-collision-check-62a8d1588116