Hi, this small patch series adapts the race fix for collision checks when moving object files into place [1] to retry linking the object into place instead of silently ignoring the error, as suggested by Peff in [2]. The series at [1] has already been merged into 'next', so this is built on top of 1b4e9a5f8b (Merge branch 'ps/build-meson-html', 2025-01-02) with ps/object-collision-check at 0ad3d65652 (object-file: fix race in object collision check, 2024-12-30) merged into it. Changes in v2: - Add retry counter so that we eventually abort in case the colliding files repeatedly reappears and vanishes again. - Evict the change to stop handling ENOENT for the source file specially into a separate commit. - Commit message improvements. - Link to v1: https://lore.kernel.org/r/20250103-b4-pks-object-file-racy-collision-check-v1-0-6ef9e2da1f87@xxxxxx Thanks! Patrick [1]: <20241230-b4-pks-object-file-racy-collision-check-v1-1-11571294e60a@xxxxxx> [2]: <20241231014220.GA225521@xxxxxxxxxxxxxxxxxxxxxxx> --- Patrick Steinhardt (3): object-file: rename variables in `check_collision()` object-file: don't special-case missing source file in collision check object-file: retry linking file into place when occluding file vanishes object-file.c | 66 +++++++++++++++++++++++++++++++++++++---------------------- 1 file changed, 41 insertions(+), 25 deletions(-) Range-diff versus v1: 1: 389b9e4180 = 1: 0c38089e9f object-file: rename variables in `check_collision()` -: ---------- > 2: d4a7ec11c8 object-file: don't special-case missing source file in collision check 2: 91a8bdf37d ! 3: 587fd01fb6 object-file: retry linking file into place when occluding file vanishes @@ Metadata ## Commit message ## object-file: retry linking file into place when occluding file vanishes - In a preceding commit, we have adapted `check_collision()` to ignore - the case where either of the colliding files vanishes. This should be - safe in general when we assume that the contents of these two files were - the same. But the check is all about detecting collisions, so that - assumption may be too optimistic. + Prior to 0ad3d65652 (object-file: fix race in object collision check, + 2024-12-30), callers could expect that a successful return from + `finalize_object_file()` means that either the file was moved into + place, or the identical bytes were already present. If neither of those + happens, we'd return an error. - Adapt the code to retry linking the object into place when we see that - the destination file has racily vanished. This should generally succeed - as we have just observed that the destination file does not exist - anymore, except in the very unlikely event that it gets recreated by - another concurrent process again. + Since that commit, if the destination file disappears between our + link(3p) call and the collision check, we'd return success without + actually checking the contents, and without retrying the link. This + solves the common case that the files were indeed the same, but it means + that we may corrupt the repository if they weren't (this implies a hash + collision, but the whole point of this function is protecting against + hash collisions). - Furthermore, stop treating `ENOENT` specially for the source file. It - shouldn't happen that the source vanishes as we're using a fresh - temporary file for it, so if it does vanish it indicates an actual - error. + We can't be pessimistic and assume they're different; that hurts the + common case that the mentioned commit was trying to fix. But after + seeing that the destination file went away, we can retry linking again. + Adapt the code to do so when we see that the destination file has racily + vanished. This should generally succeed as we have just observed that + the destination file does not exist anymore, except in the very unlikely + event that it gets recreated by another concurrent process again. - Suggested-by: Jeff King <peff@xxxxxxxx> + Helped-by: Jeff King <peff@xxxxxxxx> Signed-off-by: Patrick Steinhardt <ps@xxxxxx> ## object-file.c ## @@ object-file.c: static void write_object_file_prepare_literally(const struct git_ static int check_collision(const char *source, const char *dest) { char buf_source[4096], buf_dest[4096]; -@@ object-file.c: static int check_collision(const char *source, const char *dest) - - fd_source = open(source, O_RDONLY); - if (fd_source < 0) { -- if (errno != ENOENT) -- ret = error_errno(_("unable to open %s"), source); -+ ret = error_errno(_("unable to open %s"), source); - goto out; - } - @@ object-file.c: static int check_collision(const char *source, const char *dest) if (fd_dest < 0) { if (errno != ENOENT) @@ object-file.c: int finalize_object_file(const char *tmpfile, const char *filenam { - struct stat st; - int ret = 0; ++ unsigned retries = 0; + int ret; + +retry: @@ object-file.c: int finalize_object_file_flags(const char *tmpfile, const char *f - check_collision(tmpfile, filename)) + if (!(flags & FOF_SKIP_COLLISION_CHECK)) { + ret = check_collision(tmpfile, filename); -+ if (ret == CHECK_COLLISION_DEST_VANISHED) ++ if (ret == CHECK_COLLISION_DEST_VANISHED) { ++ if (retries++ > 5) ++ return error(_("unable to write repeatedly vanishing file %s"), ++ filename); + goto retry; ++ } + else if (ret) return -1; + } --- base-commit: 2be278337fd02495a86577a89fbf9387b2df6523 change-id: 20250103-b4-pks-object-file-racy-collision-check-a649bbf96a92